Understanding the Compound File Binary format and OLE structures to mess with CVE-2022-30190

Initially, I began this research to generate weaponized RTF files delivering the CVE-2022-30190 (Follina) exploit. Why RTF files? Because the payload in an RTF file will fire on (probably) all Windows versions (as of the date of writing this report) and can execute just by enabling the preview pane and viewing the RTF document from File Explorer. In contrast, the payload does not execute on all Windows versions when loaded from DOCX files.

To generate RTF files containing the exploit, I used Cas van Cooten's PoC code to first generate a DOCX file and then saved a copy of the same document in RTF format, which produces a valid RTF file weaponized with the exploit. What's the problem then? The problem was that every time I wanted to generate a valid RTF file, I had to first generate a DOCX file and then regenerate an RTF file from within Microsoft Word. What if I don't have MS Word? What if I'm lazy? Well, I thought to myself: "How hard could it be to automate this?" I opened the RTF file, saw where my payload was saved in plain text, replaced it, and there we go, it should work, right? (NOT).

If we take a simple look at how an RTF file can be loaded with a malicious hyperlink, as described in this article regarding CVE-2017-0199, we can see that -

This was right, as long as the last field, objdata, was loaded with a proper OLE object. Honestly, my initial assumption about this implementation came from laziness and wishful thinking (actually assuming that implementing this within Windows would be THAT easy and not convoluted). This error, which I missed entirely because I did not test the PoC properly, prompted a person with a very creative user handle to raise an issue on GitHub against Cas's PoC regarding the RTF generation feature I contributed.

MSisfuckedupmanimaginepayingtogetRCEd writes -

And indeed, after further inspection Cas confirmed that -

Apparently (which honestly makes a lot of sense now), when regenerating an RTF file containing a hyperlink to a remote template, Office generates an OLE object (actually two objects are generated, but only one of the two is needed). This can be confirmed by viewing the OLE object in HxD: we find the Compound File signature, as can be seen in this hex blob taken from the beginning of the OLE object stored within the RTF file and dumped using oletools.

This prompted me to learn how OLE objects are stored and how they work so I could automate their creation. One might ask why I am doing this, since generating an RTF loaded with Follina is easy: just regenerate it with Word. Well… I am annoyed by the GitHub issue and simply curious. Anyway, buckle up and take your sanity pills, because we are about to dive deep into some DEEP Microsoft RFCs and specifications!

OLE, Compound Binary File Format, COM and Windows theory

First, let's examine the RTF file specification to understand how RTF stores embedded objects such as files, hyperlinks and other data streams.

Microsoft OLE links, Microsoft OLE embedded objects, and Macintosh Edition Manager subscriber objects are represented in RTF as objects. Objects are destinations that contain a data part and a result part. The data part is generally hidden to the application that produced the document. A separate application uses the data and supplies the appearance of the data. This appearance is the result part of the object.

We can see how this is implemented with an RTF file which I generated using Word which should contain the Follina Payload:

The fields of interest are \objautlink, which specifies an auto-link object (essentially a link within the Word document that executes automatically), and \objupdate, which according to the RTF specification should force the object to update; in my own testing, though, it behaves inconsistently. Finally, the most interesting field is \objdata.

This subdestination contains the data for the object in the appropriate format; OLE objects are in OLESaveToStream format. This is a destination control word.
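To make the hex-encoded \objdata destination a little more concrete, here is a minimal Python sketch that pulls the hex run out of an RTF string and decodes it to raw bytes. The helper name `extract_objdata` and the hand-made sample are assumptions for illustration only; real documents can contain nested control words or obfuscation that this naive regex does not handle, which is why tools such as rtfobj from oletools exist.

```python
import re

# Matches "\objdata" followed by a run of hex digits and whitespace.
OBJDATA_RE = re.compile(r"\\objdata\s+([0-9a-fA-F\s]+)")


def extract_objdata(rtf_text):
    """Return the decoded bytes of every \\objdata hex run in rtf_text."""
    blobs = []
    for hex_run in OBJDATA_RE.findall(rtf_text):
        digits = "".join(hex_run.split())               # drop line breaks and spaces
        digits = digits[: len(digits) - len(digits) % 2]  # keep whole bytes only
        blobs.append(bytes.fromhex(digits))
    return blobs


# Hand-made sample (not taken from a real document).
sample = (
    r"{\object\objautlink\objupdate{\*\objdata 0105000002000000" "\n"
    r"090000004f4c45324c696e6b00}}"
)

blobs = extract_objdata(sample)
print(len(blobs), blobs[0].hex())
```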

This is where things start to get a bit convoluted. The generated payload is stored within an OLE object inside the RTF file; it's a hex-encoded object and it looks like this –

To understand what all these hex numerals mean, we must first understand what OLE objects are and in what format they are stored. For that we'll use Wikipedia –

OLE allows an editing application to export part of a document to another editing application and then import it with additional content. For example, a desktop publishing system might send some text to a word processor or a picture to a bitmap editor using OLE. The main benefit of OLE is to add different kinds of data to a document from different applications, like a text editor and an image editor. This creates a [Compound File Binary Format](https://en.wikipedia.org/wiki/Compound_File_Binary_Format) document and a master file to which the document makes reference. Changes to data in the master file immediately affect the document that references it. This is called "linking" (instead of "embedding").

OLE objects are essentially what allows File Explorer add-ins in your apps, drag and drop, links to Excel documents within a Word document, or GIFs in email messages. OLE objects are stored using the Compound File Binary Format (CFBF, also named CBF or CFB), which is based on the FAT file system specification. And yes, if this sounds crazy already: OLE objects use the Component Object Model (COM).

COM is a binary interface that is the basis for a lot of Microsoft technology. It allows inter-process communication, which lets Windows objects be used in environments other than the one in which they were created. For example, Word and Excel documents are unrelated, but using COM I can either link or embed a usable Excel document into a Word document. The COM technology knows how to do this using its various interfaces.

If, for example, I wanted to embed an Excel file in a Word document and display it to anyone, I would embed an OLE object within the Word document which would either link to or contain an embedded Excel file. This OLE object would contain "instructions", written using the COM interface, that explain to the Word process how to load this Excel file. Word would process the OLE object and the COM "instructions", then call the specified COM interface and load the Excel document properly into Word. Word does not understand what Excel is, but using COM it doesn't need to, because the COM interface does all the heavy lifting and allows Word to easily either link to the referenced Excel document or literally embed an Excel document within it.

Just writing this hurts my brain but in summary this picture should explain everything -

OLE and Compound File Binary Format in practice

Let's take a look at the OLE object within the generated RTF file mentioned previously. I took the raw object data stored within it and loaded it into my hex editor of choice, HxD.

As it was way too hard to read, I created a nicer diagram that should explain what's going on. (It's important to mention that all structures are stored in little-endian format.) The first 33 bytes specify the OLE object header (the last 2 bytes are missing from the picture).

Using the MS-OLEDS specification, we can interpret this as an OLE Embedded Object Container, as the FormatID field contains the value 0x2.

The class name field is also quite interesting: it contains the name "OLE2Link", which might give us a few hints about what this OLE object is meant to do. Finally, after the ObjectHeader, at offset 0x1D, we have the bytes 00 0A 00 00, i.e. the little-endian value 0x00000A00 (2,560), which represents the entire stream size of this object. This value is quite important, as modifying the OLE object requires altering this value as well; otherwise the Word process won't read the OLE object in its entirety.
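The header layout described above can be sketched in Python as follows. This is only an illustration under the assumption that the header follows the MS-OLEDS ObjectHeader layout (OLEVersion, FormatID, then three length-prefixed ANSI strings: ClassName, TopicName, ItemName) followed by NativeDataSize; the function names and the hand-built sample bytes are hypothetical.

```python
import struct


def read_length_prefixed_ansi(buf, offset):
    """Read a 4-byte little-endian length followed by that many ANSI bytes."""
    (length,) = struct.unpack_from("<I", buf, offset)
    offset += 4
    raw = buf[offset:offset + length]
    text = raw.rstrip(b"\x00").decode("ascii", errors="replace")
    return text, offset + length


def parse_object_header(buf):
    """Parse the header fields that precede NativeData."""
    ole_version, format_id = struct.unpack_from("<II", buf, 0)
    offset = 8
    class_name, offset = read_length_prefixed_ansi(buf, offset)
    topic_name, offset = read_length_prefixed_ansi(buf, offset)
    item_name, offset = read_length_prefixed_ansi(buf, offset)
    (native_data_size,) = struct.unpack_from("<I", buf, offset)
    return {
        "ole_version": ole_version,
        "format_id": format_id,            # 0x2 means embedded object
        "class_name": class_name,
        "topic_name": topic_name,
        "item_name": item_name,
        "native_data_size_offset": offset,
        "native_data_size": native_data_size,
        "native_data_offset": offset + 4,
    }


# Hand-built sample header; the size value is chosen for illustration.
sample_header = (
    struct.pack("<II", 0x00000501, 0x00000002)
    + struct.pack("<I", 9) + b"OLE2Link\x00"
    + struct.pack("<I", 0)                 # empty TopicName
    + struct.pack("<I", 0)                 # empty ItemName
    + struct.pack("<I", 0x00000A00)        # NativeDataSize
)

header = parse_object_header(sample_header)
print(header)
```

With an 8-character class name the size field lands at offset 0x1D and the header plus size field take 33 bytes, which matches the layout described above.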

Following it is the NativeData. This data is actually a Compound File Binary Format file, which stores (or should store) OLE objects, embedded files or documents, links and pictures. This can be confirmed by the first 8 bytes found in the NativeData stream.

According to MS-CFB, this value is the CFB file signature.

What is a Compound Binary File ( CBF )? According to Wikipedia it is the following –

Compound File Binary Format (CFBF), also called Compound File, Compound Document format, or Composite Document File V2 (CDF), is a compound document file format for storing numerous files and streams within a single file on a disk. CFBF is developed by Microsoft and is an implementation of Microsoft COM Structured Storage. At its simplest, the Compound File Binary Format is a container, with little restriction on what can be stored within it. A CFBF file structure loosely resembles a FAT filesystem. The file is partitioned into Sectors which are chained together with a File Allocation Table (not to be mistaken with the file system of the same name) which contains chains of sectors related to each file, a Directory holds information for contained files with a Sector ID (SID) for the starting sector of a chain and so on.

Microsoft stores OLE objects within CBFs and COM objects within those OLE objects. Why Microsoft chose CBF as the main format to store these objects can be read here. This format has mostly been replaced by Office Open XML, but as I saw, it's still used within RTF objects and old Office extensions such as:

Anyhow, the RTF hyperlink should be stored somewhere within this CBF file so let's traverse it.

General Guidelines about CBFs

CBF files are divided into 512-byte sectors. Below I added two tables that should help in understanding what the first sector of the CBF file looks like (read about it here). For OLE objects we're mostly interested in the directory sector, which contains information about the OLE data object streams.

At offset 0x30 the DWORD bytes 01 00 00 00 can be found, which indicate the location of the directory sector. Since CBFs are stored in little-endian format, the starting sector location is 1. To calculate the file offset of that sector we use the formula (1 + DirectoryStartingSectorLocation) * 512, where the extra 1 skips the 512-byte header sector, which drops us at offset 0x400.
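The offset arithmetic above can be sketched in a few lines of Python. This assumes a version 3 compound file with 512-byte sectors, as in this article; the function name and the synthetic header are made up for the example.

```python
import struct

CFB_SIGNATURE = bytes.fromhex("D0CF11E0A1B11AE1")
SECTOR_SIZE = 512  # assumption: version 3 compound file


def directory_offset(cfb):
    """Return the file offset of the first directory sector."""
    if cfb[:8] != CFB_SIGNATURE:
        raise ValueError("missing compound file signature")
    # First directory sector location: little-endian DWORD at offset 0x30.
    (first_dir_sector,) = struct.unpack_from("<I", cfb, 0x30)
    # Sector 0 starts right after the 512-byte header, hence the "+ 1".
    return (1 + first_dir_sector) * SECTOR_SIZE


# Synthetic header sector: signature plus directory location 1.
fake_header = bytearray(SECTOR_SIZE)
fake_header[:8] = CFB_SIGNATURE
struct.pack_into("<I", fake_header, 0x30, 1)

print(hex(directory_offset(bytes(fake_header))))
```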

While the CLSID field is interesting, as it indicates the type of the COM object associated with the activation of the document (in this case it's the SAX XML Reader 6.0), the more interesting fields are located at offsets 0x74 and 0x78 of the directory entry: the starting sector location (SSL) and the stream size. The starting location of each OLE stream is calculated relative to the mini stream, which starts at offset 0x600 in our case, using the formula SSL * 0x40 (the mini sector size is 64 bytes). This can be viewed properly using the olebrowse tool from the oletools collection. The stream size field specifies the size of the stream; it will be modified to shrink or grow in accordance with the length of the remote template URI.
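As a rough sketch of reading those two fields, the snippet below parses one 128-byte directory entry and converts the starting sector location into a file offset. It assumes 64-byte (0x40) mini sectors and the 0x600 mini stream base mentioned above; the helper names and the sample values (start sector 5, size 0xF0) are hypothetical. Note that this only applies to streams small enough to live in the mini stream (below the 4,096-byte cutoff).

```python
import struct

DIR_ENTRY_SIZE = 128
MINI_SECTOR_SIZE = 64  # 0x40


def parse_dir_entry(entry):
    """Return (name, object_type, start_sector, stream_size) for one entry."""
    (name_len,) = struct.unpack_from("<H", entry, 0x40)   # bytes, incl. terminator
    name = entry[: max(name_len - 2, 0)].decode("utf-16-le")
    object_type = entry[0x42]                              # 2 = stream, 5 = root
    (start_sector,) = struct.unpack_from("<I", entry, 0x74)
    (stream_size,) = struct.unpack_from("<Q", entry, 0x78)
    return name, object_type, start_sector, stream_size


def mini_stream_offset(mini_stream_base, start_sector):
    """File offset of a stream stored in the mini stream."""
    return mini_stream_base + start_sector * MINI_SECTOR_SIZE


# Synthetic directory entry with made-up values.
raw_name = "\x01Ole".encode("utf-16-le") + b"\x00\x00"
entry = bytearray(DIR_ENTRY_SIZE)
entry[: len(raw_name)] = raw_name
struct.pack_into("<H", entry, 0x40, len(raw_name))
entry[0x42] = 2
struct.pack_into("<I", entry, 0x74, 5)
struct.pack_into("<Q", entry, 0x78, 0xF0)

name, object_type, start_sector, stream_size = parse_dir_entry(bytes(entry))
print(repr(name), object_type, hex(mini_stream_offset(0x600, start_sector)), hex(stream_size))
```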

The next sector holds the MiniFAT chain, which gives information regarding the chained streams within the CFB. It's not a very important sector for this blog, but it's worthwhile to see how it looks. The chain shows how the streams are linked. Each cell in the chain represents 0x40 (64) bytes of a stream, and the chain continues until it reaches the value 0xFFFFFFFE, so the first stream runs for 5 blocks of 64 bytes (320 bytes, or 0x140 in hex).

This can be confirmed by reading the first 320 bytes in the last sector.

However, the directory entry specifies the stream size will only use 275 bytes out of the 320 bytes or (0x130)
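Following a MiniFAT chain can be sketched like this. It is a minimal illustration that assumes each MiniFAT cell is a little-endian DWORD holding the index of the next mini sector, with 0xFFFFFFFE marking the end of a chain and 64-byte mini sectors; the sample table is synthetic.

```python
import struct

ENDOFCHAIN = 0xFFFFFFFE
MINI_SECTOR_SIZE = 64


def walk_chain(minifat, start):
    """Return the list of mini sector indexes that make up one stream."""
    count = len(minifat) // 4
    entries = struct.unpack(f"<{count}I", minifat[: count * 4])
    chain = []
    current = start
    while current != ENDOFCHAIN:
        if current >= count or current in chain:
            raise ValueError("corrupt or looping MiniFAT chain")
        chain.append(current)
        current = entries[current]
    return chain


# Synthetic MiniFAT: one 5-sector stream, one 2-sector stream, one free cell.
sample_minifat = struct.pack(
    "<8I", 1, 2, 3, 4, ENDOFCHAIN, 6, ENDOFCHAIN, 0xFFFFFFFF
)

first_stream = walk_chain(sample_minifat, 0)
allocated = len(first_stream) * MINI_SECTOR_SIZE
print(first_stream, hex(allocated))
```

The allocated size is always a multiple of 64, while the stream size in the directory entry says how many of those bytes are actually used; the remainder is padding.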

The OLE Stream and Monikers

As usual, I created a diagram below of the OLE stream structure. From offset 0x810 within the OLE stream we reach the first moniker stream. A moniker is an object (or component) in Microsoft's Component Object Model (COM) that refers to a specific instance of another object. A moniker stream always starts with a CLSID which describes what type of moniker it is, followed by a data stream. There are quite a few moniker specifications; read about them here.

This OLE stream "instructs" the COM interface to load and launch the malicious hyperlink. In addition to the OLE stream, the LinkInfo stream also contains data regarding the hyperlink, and it too needs to be modified.

My hypothesis is that if I can somehow control the size of the OLE stream, the LinkInfo stream and their components, I can generate different hyperlinks.

Luckily for me and the reader, this blog is the aftermath of my success in doing so. First, let's name the important size fields:

  1. NativeDataSize – This field specifies the size of the entire OLE data object. This value cannot be easily modified unless I modify and reconstruct the MiniFAT chain and the FAT chain.

  2. Directory Stream Size – This field specifies the size of each stream. It's important to note that most of the streams don't use their entire allocated size and are padded with null bytes.

  3. OLE Stream AbsoluteMonikerStreamSize – This field specifies the size of the entire HyperLinkMoniker and its components.

  4. URLMoniker length – This field specifies the byte size of the URI string plus 24 (really, it's weird, I know, but it's specified in the URLMoniker specification).
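The last field in the list is simple enough to compute by hand, but a small Python sketch makes the arithmetic explicit. It assumes the URI is stored as a null-terminated UTF-16LE string and that the terminator counts toward the byte size; those details, like the function name, are assumptions here and should be checked against the URLMoniker specification.

```python
import struct


def url_moniker_length(url, with_extra_fields=True):
    """Byte size of the encoded URI, plus 24 when the extra fields are present."""
    url_bytes = url.encode("utf-16-le") + b"\x00\x00"  # assumed null terminator
    return len(url_bytes) + (24 if with_extra_fields else 0)


# Hypothetical URI, just to show the numbers.
length = url_moniker_length("http://a/")
length_field = struct.pack("<I", length)  # little-endian DWORD as stored on disk
print(length, length_field.hex())
```

Every time the URI changes, this value changes with it, and the enclosing moniker and stream size fields from the list have to be adjusted by the same delta.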

Crafting an OLE Stream

The first problem that needs addressing is the total size of the streams. While I could manually adjust the CFB file myself, a much easier solution is to just generate a larger OLE stream. Using Cas van Cooten's PoC code for Follina mentioned at the beginning of the blog, I simply supplied a longer port value, 65535 instead of the default 80, which in turn generates a larger stream. This time the NativeDataSize contains the value 0xC00, which is 512 bytes larger than previously (just one sector larger).

Additionally, the MiniFAT chain is much larger now.

So, this marks the problem as solved.

Next, I decided to look at the RootEntry stream size field. Currently it's set to 0x142 (322) bytes.

But I know for a fact that it's padded with null bytes to align with the MiniFAT sector size of 64 bytes.

So essentially, this stream size can be increased to 0x180 (384) bytes. The same can be done for the LinkInfo object.

The current LinkInfo stream size is 0xf0 (240) but can be increased to 0x240 (576) bytes! This is as simple as just changing the values within the blob.

The final two problems are quite easy to solve with some simple math: calculate the total size of the objects, subtract the size of the modified URL fields from the original size, and pad what's left with zeros in the appropriate locations.
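The idea of swapping the URL while keeping every stream the same length can be sketched as below. This is not the script linked in this post, just a minimal illustration under the assumption that the spare room is trailing null padding at the end of the stream; `align_up` and `replace_padded` are hypothetical helpers, and the length fields named earlier still have to be patched separately.

```python
MINI_SECTOR_SIZE = 64


def align_up(size, boundary=MINI_SECTOR_SIZE):
    """Round size up to the next multiple of boundary."""
    return (size + boundary - 1) // boundary * boundary


def replace_padded(stream, old, new):
    """Replace old with new inside stream without changing len(stream)."""
    index = stream.find(old)
    if index < 0:
        raise ValueError("original value not found in stream")
    patched = stream[:index] + new + stream[index + len(old):]
    growth = len(patched) - len(stream)
    if growth > 0:
        # The new value is longer: eat into the trailing null padding.
        if patched[-growth:].strip(b"\x00"):
            raise ValueError("not enough null padding to absorb the new value")
        patched = patched[:-growth]
    else:
        # The new value is shorter (or equal): pad the tail with zeros.
        patched += b"\x00" * (-growth)
    return patched


# Synthetic 64-byte stream with a short URL and 49 bytes of padding.
stream = b"HEAD" + b"http://x/" + b"TL" + b"\x00" * 49
patched = replace_padded(stream, b"http://x/", b"http://longer.example/")
print(len(stream), len(patched), hex(align_up(0x142)))
```

Because the stream length never changes, the MiniFAT and FAT chains stay valid and only the size fields inside the blob need new values.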

By doing this, I'm essentially just modifying the objects and not changing the stream alignment! This in fact works perfectly! Here is a demonstration of Follina where one VM in a VLAN serves the payload at a typically long URI and another VM in the VLAN retrieves the payload using the RTF script.

You can read the Python code here!


That's it! I hope you enjoyed reading this and learned something about OLE Objects and Microsoft Magic! See you next time!