What Is YARA?

YARA is a powerful and versatile tool that helps you identify and classify files based on their content. Imagine you have a bunch of files and you want to find all the ones that contain a specific pattern, like a certain string of text or a particular sequence of bytes. YARA lets you create custom rules that match these patterns and then scan your files to find any matches. It’s like having a super-powered search engine to find exactly what you’re looking for, even if it’s hidden deep inside a file. YARA is great for malware researchers, security professionals, and anyone else who needs to analyze files quickly and efficiently.

What Are YARA Rules?

YARA rules are descriptions that you write to tell YARA which files to flag based on their content. Each rule has a name and up to three sections: an optional meta section holding information about the rule, an optional strings section defining the patterns to look for, and a mandatory condition section containing the logic that decides whether a file matches.

The strings section can hold text strings, regular expressions, or hexadecimal byte sequences. The condition section is a Boolean expression over those strings and other properties of the file. A rule does not carry out actions of its own; when the condition evaluates to true, YARA simply reports the rule as a match, and it is up to you or your tooling to act on that result.

Installing YARA

First things first, let us install YARA onto our system using the following command.

Note: I am using a Debian-based system in the following demonstrations. Please adjust the installation command according to your Linux distribution.

┌──(N3NU㉿kali)-[~]
└─$ sudo apt-get install yara

If you have issues installing YARA, try refreshing the package index with the following command first, then run the install command again.

┌──(N3NU㉿kali)-[~]
└─$ sudo apt-get update

Creating Our First YARA Rule

Before we can use the YARA tool to analyze files, we need to make sure we have two things required to run the tool. First, we must have a file to analyze, and second, we need to create YARA rules to use against the file we wish to analyze.

Let us first begin by creating a simple text file containing the text “Learning is fun!”

┌──(N3NU㉿kali)-[~]
└─$ echo "Learning is fun!" > file.txt

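Now it is time to create our first YARA rule! Create a new file and name it first_yara_rule.yar. As you can tell by the name, YARA rule files conventionally use a .yar file extension. This is a convention rather than a requirement; YARA reads the rules from whatever file you pass to it.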

┌──(N3NU㉿kali)-[~]
└─$ touch first_yara_rule.yar

Using your favorite text editor (mine is nano), open the newly created first_yara_rule.yar file and enter the following code.

┌──(N3NU㉿kali)-[~]
└─$ nano first_yara_rule.yar
rule funny
{
condition: true
}

Let us break down what the code is doing inside the first_yara_rule.yar file. The first line creates a new rule called funny; this name is the rule's identifier. Identifiers are case-sensitive, may contain only letters, digits, and underscores, and cannot start with a digit or include spaces. The code within the curly brackets states the condition of the rule. If a file meets the condition, YARA reports the rule as a match for that file. In this case, the condition is set to true, meaning that every file will match.
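To make the identifier rules concrete, here is a minimal Python sketch that checks a candidate rule name against them. It is not part of YARA; the function name and the keyword list are my own illustration, and the keyword list is deliberately incomplete. The 128-character limit is taken from the YARA documentation.

```python
import re

# Letters, digits, and underscores only, not starting with a digit,
# and at most 128 characters in total.
IDENTIFIER_RE = re.compile(r"[A-Za-z_][A-Za-z0-9_]{0,127}")

# A few of YARA's reserved keywords (illustrative, NOT the full list).
SOME_KEYWORDS = {"rule", "meta", "strings", "condition", "true", "false",
                 "and", "or", "not", "at"}


def looks_like_valid_identifier(name: str) -> bool:
    """Return True if name passes the basic identifier checks above."""
    return IDENTIFIER_RE.fullmatch(name) is not None and name not in SOME_KEYWORDS


print(looks_like_valid_identifier("funny"))     # True
print(looks_like_valid_identifier("1st_rule"))  # False, starts with a digit
print(looks_like_valid_identifier("my rule"))   # False, contains a space
```

Because identifiers are case-sensitive, funny and Funny would both pass this check but would name two different rules.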

I know what you are thinking: this particular YARA rule is not useful in real-world file analysis since it applies to every file. For this demonstration, let us pretend that it is.

Analyzing Files Using the YARA Command

We now have the data required to run the YARA command we installed two sections prior.

To analyze a file using the YARA command, use the following format.

yara <yara-rule> <target-file>

If we pass the two files we created as arguments and run the command, we should see the following.

┌──(N3NU㉿kali)-[~]
└─$ yara first_yara_rule.yar file.txt
funny file.txt

The first column of the output is the name of the rule that found a match, in this case, it was funny. The second column of the output contains the name of the target file, file.txt, which met the conditions of the rule. Remember, since the condition for the rule funny was set to true, any target file would have matched. Let us now take a look at creating a rule that matches a file of a specific type.
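If you ever want to process this output in a script, the two-column layout is easy to split. The following is a small sketch in Python; the function name parse_match_line is hypothetical, and it assumes the default output format shown above (rule name, a space, then the target path).

```python
def parse_match_line(line: str) -> tuple[str, str]:
    """Split one line of yara output into (rule_name, target_path)."""
    # Split on the first space only, so a path containing spaces stays intact.
    rule_name, target_path = line.strip().split(" ", 1)
    return rule_name, target_path


rule_name, target_path = parse_match_line("funny file.txt")
print(rule_name)    # funny
print(target_path)  # file.txt
```

Rule identifiers cannot contain spaces, which is why splitting on the first space is enough to separate the two columns.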

Creating YARA Rules to Identify PDF Files

Before we begin creating a rule to identify whether or not a file is a PDF, let us first obtain a benign PDF to work with. If you already have a PDF then you are good to go, but if you do not, we will use some Google Dorking to retrieve one.

Head on over to Google and search for learning filetype:pdf (see figure 1 below). Note that there is no space after the colon in the filetype: operator. The query returns PDF files related to learning. If you want to learn more about Google Dorking, check out my write-up here. I chose to use the PDF file found in the first result of my search; your results may be different.

Note: I am conducting my demonstrations in a safe environment. Please be vigilant when downloading ANYTHING from the internet.

Figure 1: Searching for a PDF

Once you have obtained a PDF file, create a file called pdf_file.yar and put the following code inside of it.

rule pdf_checker
{
meta:
author = "N3NU"
description = "Checks whether or not a file is truly a PDF."
}

Looking at the code above, we are already familiar with the rule identifier, but what is the other stuff between the curly brackets? This is the metadata of the YARA rule, used to provide additional data about the rule and authorship.

In order to identify if a file is a PDF, we will use strings common to a PDF file to help us build the conditions of the rule. We can assign strings to variables and make conditions based on whether or not the variable exists within the file in question. Here is an example of the declaration of the variable and value assignment.

$variable = "value"

As you can see, variables are identified by the $ symbol.

The first line of a PDF file is known to start with %PDF- followed by a version number. The last line of a PDF file contains %%EOF. We can use this knowledge to check if our PDF file is legitimate. We can do so by running the strings command against our PDF file.

Note: The name of the PDF file I am using is finalreport.pdf. Yours can have a different name and be a completely different PDF file.

┌──(N3NU㉿kali)-[~]
└─$ strings finalreport.pdf

After running the command above, we can see the first and last lines of the PDF file captured in figures 2 and 3 below, respectively.

Figure 2: Highlighting the First Line of a Benign PDF File
Figure 3: Highlighting the Last Line of a Benign PDF File
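If the strings command is not available to you, its basic behavior can be approximated in a few lines of Python. This is only a rough sketch: it collects runs of at least four printable ASCII characters, which is close to, but not identical to, what the real tool does. The sample bytes are made up for illustration and are not a real PDF.

```python
import re


def printable_strings(data: bytes, min_length: int = 4) -> list[str]:
    """Return runs of printable ASCII that are at least min_length bytes long."""
    pattern = rb"[\x20-\x7e]{%d,}" % min_length
    return [match.decode("ascii") for match in re.findall(pattern, data)]


# Made-up bytes shaped like a tiny PDF: header, some binary junk, trailer.
sample = b"%PDF-1.7\n\x00\x01\x02binary\xff\n%%EOF\n"

found = printable_strings(sample)
print(found[0])   # %PDF-1.7
print(found[-1])  # %%EOF
```

To try it on your own file, read the file in binary mode, for example with open("finalreport.pdf", "rb"), and pass the bytes to the function.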

Since we have now confirmed our PDF file contains the appropriate strings, we will now add a strings section to our pdf_file.yar file.

rule pdf_checker
{
meta:
author = "N3NU"
description = "Checks whether or not a file is truly a PDF."
strings:
$start = "%PDF"
$end = "%%EOF"
}

Notice how the new strings section sits below the meta section. Our YARA rule is almost complete, but it will not compile yet: we still need to add the condition section, which is mandatory. The condition section is added in the following example.

rule pdf_checker
{
meta:
author = "N3NU"
description = "Checks whether or not a file is truly a PDF."
strings:
$start = "%PDF"
$end = "%%EOF"
condition:
$start at 0 and $end
}

So, what exactly is going on in the condition section? We are stating that the $start string must appear at offset 0, that is, beginning at the very first byte of the file. The and operator means that the $end string must also appear somewhere in the file.
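To see the logic of this condition outside of YARA, here is a plain-Python sketch of the same two checks. It does not use YARA at all; the function name matches_pdf_checker and the sample byte strings are made up for illustration.

```python
def matches_pdf_checker(data: bytes) -> bool:
    """Mimic the condition `$start at 0 and $end` in plain Python."""
    starts_with_header = data.startswith(b"%PDF")  # $start at 0
    contains_trailer = b"%%EOF" in data            # $end anywhere in the file
    return starts_with_header and contains_trailer


print(matches_pdf_checker(b"%PDF-1.4\n...\n%%EOF\n"))  # True
print(matches_pdf_checker(b"Learning is fun!\n"))      # False
print(matches_pdf_checker(b"junk%PDF-1.4\n%%EOF\n"))   # False, header not at offset 0
```

The third call shows why the at 0 part matters: the header string is present, but not at the start of the data, so the check fails.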

Running our YARA rule against our PDF file we can see that we get a successful match, shown in figure 4 below.

Figure 4: Checking the Legitimacy of a PDF File Using YARA Rules

What if we try to run the pdf_file.yar rule against the file.txt file we created some time ago? By doing so, we will not get a match since the structure of the file.txt file is not the same as that of a PDF file. Figure 5 below proves this by not displaying any output, signifying that there is no match.

Figure 5: Checking if file.txt Contains Patterns Belonging to a PDF File

Creating YARA Rules With Hexadecimal Values

We will now be taking a look at creating a rule based on a hexadecimal pattern. Once again, the file.txt file created earlier will be used as our target.

To pull hexadecimal patterns from file.txt we need to use a tool such as xxd. Here is what that would look like.

┌──(N3NU㉿kali)-[~]
└─$ xxd file.txt
00000000: 4c65 6172 6e69 6e67 2069 7320 6675 6e21 Learning is fun!
00000010: 0a .

We are interested in copying the first line of hexadecimal values. It holds 16 bytes, which xxd displays as eight groups of two bytes (16 bits) each, as seen in figure 6 below.

Figure 6: Converting file.txt to Hexadecimal

Using our hexadecimal values, let us create a new rule within the pdf_file.yar file titled my_file. The contents of pdf_file.yar should now look like the following.

rule pdf_checker
{
meta:
author = "N3NU"
description = "Checks whether or not a file is truly a PDF."
strings:
$start = "%PDF"
$end = "%%EOF"
condition:
$start at 0 and $end
}

rule my_file
{
strings:
$bytes = {4c65 6172 6e69 6e67 2069 7320 6675 6e21}
condition:
$bytes
}

We can see that we now have a $bytes variable holding the hexadecimal values. They are placed between curly brackets, rather than quotes, to tell YARA that this is a hexadecimal byte pattern and not a text string. The condition section contains only $bytes, meaning that a file must contain this byte sequence somewhere in order to match the my_file rule.
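As a quick sanity check on the hexadecimal values, we can decode them in Python and confirm that they spell out the text we wrote to file.txt. This is a plain-Python sketch, not YARA; the file_bytes value is typed in by hand to mirror what echo wrote to the file, including the trailing newline.

```python
# The same hexadecimal values used in the $bytes pattern.
pattern = bytes.fromhex("4c65 6172 6e69 6e67 2069 7320 6675 6e21")
print(pattern)       # b'Learning is fun!'
print(len(pattern))  # 16

# Contents of file.txt as written by echo (note the trailing newline, 0x0a).
file_bytes = b"Learning is fun!\n"

# The condition `$bytes` is true when the pattern occurs anywhere in the file.
print(pattern in file_bytes)  # True
```

The pattern covers only the first 16 bytes, so the newline that xxd shows on its second line is not part of it, and the rule still matches.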

What will happen if we run the file.txt file against pdf_file.yar now? Let us find out. See the output in figure 7 below.

Figure 7: Checking if file.txt Matches Against the Rules in pdf_file.yar

As expected, file.txt does not match the first rule, pdf_checker, but we already knew that. The file.txt file does match the second rule, my_file, because as we know, the file contains the hexadecimal pattern that matches the $bytes variable within the rule.

Conclusion

In conclusion, YARA is a powerful tool that can be used to detect and analyze malicious files and data. By creating YARA rules, cybersecurity analysts can automate the process of identifying and classifying threats, making their job more efficient and effective. Because YARA matches the patterns you describe in your rules, it works best for detecting known malware and new samples that share strings or byte sequences with threats you have already analyzed.

I hope you have enjoyed learning about YARA and creating YARA rules. Subscribe for more tools, tips, and tricks to add to your arsenal.

Until next time…

N3NU

Disclaimer: My content is for informational and educational purposes only. You may try out these hacks on your own computer at your own risk.
