Spooky simple python tricks

Photo by Chris Ried / Unsplash

Whether or not you are a programmer, being able to explore and manipulate data is a super power.

Python is an amazing programming language. Python is simple, concise and flexible. It is especially useful to quickly solve real problems. There is a reason that Python is the language of ML and AI.

Here are some simple Python tricks to quickly format data, encode and decode various data types, and automate exploring the CISA KEV list.

REPL

For starters, interactivity is key for exploring data and quickly iterating on ideas. The Python REPL makes exploring and playing with data a breeze.

With Python installed on most systems, using the REPL is as simple as typing python3 into your terminal.

$ python3
Python 3.11.8 (main, Feb 13 2024, 09:03:56) [GCC 12.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>>

Once in the REPL, you are executing Python. Just type some Python and hit enter. It couldn't be simpler.

>>> 1+1
2
>>> print("hello world")
hello world

HELP!

Don't worry, Python is here to help. The help() function shows you documentation and is a useful reminder if you forget any of the syntax.

>>> help()
Welcome to Python 3.11's help utility! If this is your first time using
Python, you should definitely check out the tutorial at
https://docs.python.org/3.11/tutorial/.

Enter the name of any module, keyword, or topic to get help on writing
Python programs and using Python modules.  To get a list of available
modules, keywords, symbols, or topics, enter "modules", "keywords",
"symbols", or "topics".

Virtual environments

You will likely need to install some additional packages to make your life easier. Don't overthink this; we are hacking and exploring data. Virtual environments are a perfect solution for experimenting and quickly installing packages in a partially isolated environment.

python3 -m venv venv
source venv/bin/activate
pip install --upgrade pip
pip install requests rich

Pretty Printing

If you are going to be looking at data in the shell, you need to make the output pretty. Thankfully we have the amazing rich library, which has many features to make your script output beautiful. Luckily for us, rich has the pretty module. Once pretty.install() has been called in the REPL, Python data structures are automatically pretty printed with syntax highlighting.

>>> from rich import pretty
>>> pretty.install()
>>> x = {'a': 1, 'b': 2}
>>> x
{'a': 1, 'b': 2}
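
If rich is not installed yet, the standard library's pprint module gives a plainer version of the same idea. The sketch below is only an illustration: the entry dictionary is a trimmed copy of the KEV record shown later in this post, and width=60 is an arbitrary choice that forces one key per line.

```python
from pprint import pformat, pprint

# A trimmed copy of one KEV record, used here only as sample data.
entry = {
    "cveID": "CVE-2024-27198",
    "vendorProject": "JetBrains",
    "product": "TeamCity",
    "dateAdded": "2024-03-07",
    "dueDate": "2024-03-28",
}

# pprint() wraps the dictionary once it no longer fits in `width` columns.
pprint(entry, width=60)

# pformat() returns the same text as a string instead of printing it.
formatted = pformat(entry, width=60)
```

There are no colors here, but the output is still far easier to scan than a single long line.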

HTTP Client

The world runs on HTTP. Being able to make quick simple HTTP requests and parse the data is its own super power. Maybe you would like to explore the CISA Known Exploited Vulnerabilities (KEV) list.

The Python requests module is designed to make simple HTTP requests, well... simple. The requests tag line is "Built for human beings."

Here we make an HTTP request to download the CISA KEV list and parse the response as JSON with the .json() call. In Python, a JSON object is parsed into the built-in dictionary data type. For our purposes, a dictionary is a set of key: value pairs. The .keys() method on a dictionary shows us the top-level keys. We then get the values for the title and dateReleased keys, and finally look at the last entry in the vulnerabilities list.

>>> import requests
>>> r = requests.get(
...     "https://www.cisa.gov/sites/default/files/feeds/known_exploited_vulnerabilities.json"
... ).json()
>>> r.keys()
dict_keys(['title', 'catalogVersion', 'dateReleased', 'count', 'vulnerabilities'])
>>> r['title'], r['dateReleased']
('CISA Catalog of Known Exploited Vulnerabilities', '2024-03-18T17:37:00.9876Z')

>>> r['vulnerabilities'][-1]
{
    'cveID': 'CVE-2024-27198',
    'vendorProject': 'JetBrains',
    'product': 'TeamCity',
    'vulnerabilityName': 'JetBrains TeamCity Authentication Bypass Vulnerability',
    'dateAdded': '2024-03-07',
    'shortDescription': 'JetBrains TeamCity contains an authentication bypass vulnerability that allows an attacker to perform admin actions.',
    'requiredAction': 'Apply mitigations per vendor instructions or discontinue use of the product if mitigations are unavailable.',
    'dueDate': '2024-03-28',
    'knownRansomwareCampaignUse': 'Unknown',
    'notes': 'https://www.jetbrains.com/help/teamcity/teamcity-2023-11-4-release-notes.html'
}

Finally, let's build a list of all of the CVEs on the CISA KEV list, get a count of how many CVEs we found, and look at the first 5 entries as a sanity check.

>>> cves = [i['cveID'] for i in r['vulnerabilities']]
>>> len(cves)
1089
>>> cves[0:5]
['CVE-2021-27104', 'CVE-2021-27102', 'CVE-2021-27101', 'CVE-2021-27103', 'CVE-2021-21017']
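
The same comprehension style extends to quick summaries. Here is a minimal sketch that counts entries per vendor and filters by date. To keep it runnable offline it uses a tiny hypothetical stand-in for r; the CVE IDs and vendor names are placeholders, and only the field names (cveID, vendorProject, dateAdded) come from the real feed.

```python
from collections import Counter

# Hypothetical stand-in for the parsed KEV response; the values are made up.
sample = {
    "vulnerabilities": [
        {"cveID": "CVE-0000-0001", "vendorProject": "ExampleVendor", "dateAdded": "2024-01-10"},
        {"cveID": "CVE-0000-0002", "vendorProject": "ExampleVendor", "dateAdded": "2024-02-15"},
        {"cveID": "CVE-0000-0003", "vendorProject": "OtherVendor", "dateAdded": "2024-03-07"},
    ]
}

# Count how many entries each vendor has.
vendor_counts = Counter(v["vendorProject"] for v in sample["vulnerabilities"])
print(vendor_counts.most_common(2))

# ISO dates (YYYY-MM-DD) sort correctly as plain strings, so a string
# comparison is enough to find entries added on or after a given day.
recent = [v["cveID"] for v in sample["vulnerabilities"] if v["dateAdded"] >= "2024-02-01"]
print(recent)
```

Swapping sample for the r dictionary from the request above should give the same kind of summary over the full list.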

Encoding and decoding data

Data can be represented in many ways and thus comes in all shapes and sizes. Being able to convert between data types is essential for your new super power. Let's look at the very basics.

[WARNING] encoding is not encryption
"bWFuaXB1bGF0aW5nIGRhdGEgaXMga2V5"

"6d616e6970756c6174696e672064617461206973206b6579"

[
    '0b1101101', '0b1100001', '0b1101110', '0b1101001', '0b1110000', '0b1110101',
    '0b1101100', '0b1100001', '0b1110100', '0b1101001', '0b1101110', '0b1100111',
    '0b100000', '0b1100100', '0b1100001', '0b1110100', '0b1100001', '0b100000',
    '0b1101001', '0b1110011', '0b100000', '0b1101011', '0b1100101', '0b1111001'
]

Base64 is one of the most common schemes for encoding binary data as printable characters. It is so common that it is built into the Python standard library as the base64 module.

Let's take a base64 value, decode it to bytes, and then encode it back to base64. Take note that in the Python REPL the underscore _ holds the result of the last expression evaluated. This is super useful if you forget to assign the output of your command.

>>> from base64 import b64encode, b64decode
>>> x = "bWFuaXB1bGF0aW5nIGRhdGEgaXMga2V5"
>>> b64decode(x)
b'manipulating data is key'
>>> b64encode(_)
b'bWFuaXB1bGF0aW5nIGRhdGEgaXMga2V5'
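
Note that b64encode and b64decode work on bytes, not on str. A small sketch of the full round trip from a normal string, assuming UTF-8 is the text encoding you want:

```python
from base64 import b64decode, b64encode

text = "manipulating data is key"

# str -> bytes -> base64 bytes -> str
encoded = b64encode(text.encode("utf-8")).decode("ascii")
print(encoded)

# base64 str -> bytes -> str
decoded = b64decode(encoded).decode("utf-8")
print(decoded)
```

The .encode() and .decode() calls are the only extra steps compared with the REPL session above.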

Another way to represent binary data as printable characters is hexadecimal. This representation is often seen in tools like hexdump.

Here is how you can convert from a hexadecimal encoding to binary and back.

>>> from binascii import hexlify, unhexlify
>>> unhexlify("6d616e6970756c6174696e672064617461206973206b6579")
b'manipulating data is key'
>>> hexlify(_)
b'6d616e6970756c6174696e672064617461206973206b6579'
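
As an alternative to binascii, the built-in bytes type has its own hex helpers, which saves an import. A short sketch of the same round trip:

```python
hex_text = "6d616e6970756c6174696e672064617461206973206b6579"

# bytes.fromhex() parses a hex string into raw bytes.
raw = bytes.fromhex(hex_text)
print(raw)

# bytes.hex() goes the other way and returns a str rather than bytes.
round_trip = raw.hex()
print(round_trip)
```

One difference worth noticing: hexlify returns bytes, while .hex() returns a str.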

With all this talk of binary, let's look at how you can use a list comprehension to see the world in binary.

First we create a list of binary values called x. On the next line we convert each item in the list to an integer with the call to int() and convert that integer into a character with the chr() function, then join all the characters into a single string. Finally we go the other way, turning each character back into its binary representation with ord() and bin().

>>> x = [
...     '0b1101101', '0b1100001', '0b1101110', '0b1101001', '0b1110000', '0b1110101',
...     '0b1101100', '0b1100001', '0b1110100', '0b1101001', '0b1101110', '0b1100111',
...     '0b100000', '0b1100100', '0b1100001', '0b1110100', '0b1100001', '0b100000',
...     '0b1101001', '0b1110011', '0b100000', '0b1101011', '0b1100101', '0b1111001'
... ]
>>> ''.join([chr(int(i, 2)) for i in x])
'manipulating data is key'
>>> [bin(ord(i)) for i in _]
['0b1101101', '0b1100001', '0b1101110', '0b1101001', '0b1110000', '0b1110101',
 '0b1101100', '0b1100001', '0b1110100', '0b1101001', '0b1101110', '0b1100111',
 '0b100000', '0b1100100', '0b1100001', '0b1110100', '0b1100001', '0b100000',
 '0b1101001', '0b1110011', '0b100000', '0b1101011', '0b1100101', '0b1111001']
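
The values from bin() vary in length and carry a 0b prefix. If you would rather see fixed-width bytes, format() with the "08b" format spec is one option. A minimal sketch, assuming every character fits in 8 bits (true for plain ASCII text like this):

```python
text = "manipulating data is key"

# "08b" means: binary, zero-padded to 8 digits, no "0b" prefix.
bits = [format(ord(ch), "08b") for ch in text]
print(bits[:3])

# int(..., 2) parses the digits back, with or without the "0b" prefix.
restored = "".join(chr(int(b, 2)) for b in bits)
print(restored)
```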

As we saw earlier, a simple way to interact with JSON objects in Python is to convert them to the dictionary data structure. What if you want to save the dictionary locally instead of reading it from an HTTP request?

Here we save or dump a Python dictionary as a JSON string into the file example.json and then read or load that same JSON string from a file converting it back to a dictionary.

The magic here is the call to the open() function, which returns a file object that the json module can then use to write or read the file contents.

>>> import json
>>> json.dump({'a': 1, 'b': 2}, open('example.json', 'w'))
>>> json.load(open('example.json'))
{'a': 1, 'b': 2}
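
The one-liners above are fine for the REPL. In a script you would usually wrap open() in a with block so the file is closed as soon as you are done with it. A small sketch, which writes to the system temp directory (an arbitrary choice for this example) and uses indent=2 to make the file easier to read:

```python
import json
import tempfile
from pathlib import Path

data = {"a": 1, "b": 2}

# Example location only; any writable path works.
path = Path(tempfile.gettempdir()) / "example.json"

# The with block closes the file automatically, even if an error occurs.
with open(path, "w") as fh:
    json.dump(data, fh, indent=2)

with open(path) as fh:
    loaded = json.load(fh)

print(loaded)
```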

Yet another markup language is YAML. As it is designed to be human readable, YAML is often used for configuration files and is the de facto language of DevOps. Thanks to the PyYAML library, it is easy to convert YAML to Python dictionaries and store the objects as files.

First we need to install the PyYAML module into the virtualenv.

pip install pyyaml

Once installed, we can interact with YAML using pretty much the same syntax as the json module.

>>> import yaml
>>> yaml.dump({'a': 1, 'b': 2}, open('example.yaml', 'w'))
>>> yaml.safe_load(open('example.yaml'))
{'a': 1, 'b': 2}