Featured Articles

Detecting (Malicious) Unicode in GitHub PRs
Crowdfunding on Crowd Supply (Review of my experience)
Nightmare on Lemmy Street (A Fediverse GDPR Horror Story)
WordPress Multisite on the Darknet (Mercator .onion alias)
Continuous Documentation: Hosting Read the Docs on GitHub Pages (2/2)
Trusted Boot (Anti-Evil-Maid, Heads, and PureBoot)
Introducing BusKill: A Kill Cord for your Laptop
WordPress Profiling with XHProf (Debugging & Optimizing Speed)
Hardening Guide for phpList
previous arrow
next arrow

Detecting (Malicious) Unicode in GitHub PRs

Detecting Malicious Unicode in GitHub Pull Requests

This article will describe how you can utilize GitHub Actions to scan user-contributed PRs for unicode and automatically warn you if such commits contain (potentially invisible & malicious) unicode characters.

Why

Last month Trojan Source was published --- which described how malicious unicode characters could make source code appear benign, yet compile to something quite malicious.

Michael Altfield

Hi, I’m Michael Altfield. I write articles about opsec, privacy, and devops ➡

About Michael


. . . → Read More: Detecting (Malicious) Unicode in GitHub PRs

RegEx 2 DFA in Python

For my Discrete Mathematics II course at UCF (COT4210), I had to do some implementation with Finite State Machines. My favorite of our tasks (though the most difficult) was to convert a Regular Expression (RE) to an equivalent Deterministic Finite Automata (DFA). And since our professor let us use any language, I tried to branch out from Java & C (which are annoyingly overused in Academia). I decided to teach myself Python. And it turns out, it was a good choice too--considering it's wonderful built-in functionality for Lists, and the heart of this program is a huge 2D array defining the automata's transition function. Also, I miss scripting languages--especially when I'm writing a program as a learning experiment as opposed to trying to make it as efficient as possible.

So, without further Ado: here's my code. It reads a RE in postfix notation from input.txt. Two cautions about postfix REs:

You must explicitly state concatenation The Kleen Star is already a postfix operator in REs, so it doesn't really work to use a mathematical infix2postfix library, as it treats the kleen star like an infix multiplicative operator. I treat it as an operand and throw it directly into the
. . . → Read More: RegEx 2 DFA in Python