Holy Forking Shirt Balls!

Posted on 11th May, 2021

How do you filter profanity in mobile games? Here’s a guide from Dominic, one of our Hutch senior engineers.

My name’s Dominic Brittain, and I’ve worked at Hutch for over 8 years. I’ve predominantly worked in the free-to-play mobile sector throughout my career, first at Strawdog Studios and then at Hutch, which I joined as a startup in early 2013. In this article, I’m going to discuss what Hutch set out to achieve with profanity filtering, the approach we took, the caveats we ran into and the solution we ultimately settled on, along with some improvements that could be made going forward.

The Objective

Profanity filtering is a subject that has been widely researched over the years from both a functional and an optimisation perspective. Any product that needs some content flagged as unsuitable for its audience has had to look into ways of detecting profane content and then censoring it in a way that makes sense to the user on their client device.

A common feature that might require a profanity filter is a name editor, where the text a user enters as their name needs to be checked for offensive language before they are allowed to submit it. This becomes even more important when handling more extensive text input features such as chat rooms, where users interact with one another much more frequently and are much freer to say what they want.

The big question is how to approach filtering large quantities of content in a sensible and cost-effective manner, whilst trying not to intrude too much on the user experience of the product. With all this in mind, our objective at Hutch was to come up with a solution that allowed us to filter content we deem offensive, across a wide range of languages, quickly and cost-effectively.

The Multi-Language Problem

One of the main considerations when deciding on a solution was the need to support many different languages, each with its own structures, characters and language patterns. This makes it much more difficult to design a system that can automatically catch all cases of profanity without throwing up issues.

It is also worth noting that languages vary massively in the way they are written. For example, some Asian languages are written using glyphs rather than letters, and a glyph can mean many different things depending on the accents it carries. Languages such as Russian also have their own set of characters, which have their own influence on a word’s meaning. This creates its own set of challenges when searching for profanity and deciding when and when not to flag some content.

A good example of how two languages differ in construction, in a way that could trip up a profanity detection algorithm, is the double L in Spanish, which is pronounced more like a ‘Y’ than the L sound we are accustomed to in English. The same characters play different roles in each language, so we can’t assume that every word containing a certain character pattern is unsuitable content. With so many differences across the set of languages we were looking to support, we needed a solution that could handle per-language searches, to avoid flagging content as profanity when in fact it isn’t.

The Scunthorpe Problem

The Scunthorpe problem is a widely recognised issue when dealing with string comparison techniques. In the context of a multi-language profanity detection algorithm, it’s reasonable to expect that some harmless words will contain profane words within them. For example, the word Scunthorpe contains the character sequence of a particular profane word, so an algorithm that simply cycles through a string’s characters, matching as it goes, will not always work as intended and will end up flagging content as profanity when it isn’t.

One way to combat this issue is to check forwards and backwards when a match is found part-way through a word, that is, when the match doesn’t begin at the word’s starting character, and to test the surrounding characters against some predefined conditions. This catches the issue in most cases, so long as the predefined conditions are set up correctly to begin with. On the flip side, it can also let through content that is in fact profanity, so it is very difficult to catch every word that suffers from the initial problem without creating a subsidiary issue by trying to fix it in an automated way.
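As a rough illustration of that trade-off, here is a minimal Python sketch. The mild placeholder word 'heck' stands in for a black-listed term, and the function names and the boundary rule (a match only counts when neither neighbouring character is a letter) are assumptions made for this example, not the conditions used at Hutch.

```python
BLACK_LIST = {"heck"}  # placeholder word, assumed for illustration


def naive_contains(text: str) -> bool:
    """Flag the text if a black-listed word appears anywhere inside it."""
    lowered = text.lower()
    return any(word in lowered for word in BLACK_LIST)


def boundary_contains(text: str) -> bool:
    """Flag a match only when the characters either side of it are not letters."""
    lowered = text.lower()
    for word in BLACK_LIST:
        start = lowered.find(word)
        while start != -1:
            end = start + len(word)
            # Look backwards and forwards from the match.
            before_ok = start == 0 or not lowered[start - 1].isalpha()
            after_ok = end == len(lowered) or not lowered[end].isalpha()
            if before_ok and after_ok:
                return True
            start = lowered.find(word, start + 1)
    return False
```

The naive check flags "please check this" because 'check' contains the placeholder, while the boundary check lets it through. The same rule also lets "ohheck" through, which is the kind of subsidiary issue described above.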

This becomes many times more difficult when you are trying to support many different languages, as the number of potential words to check multiplies. It’s also worth mentioning that every check added to the algorithm costs performance, which matters when large volumes of content are being checked frequently, as in a chat feature.

Another approach is to simply use a white list: a stored list of words that are definitely not profanity. In this case, the word ‘Scunthorpe’ could be added to the white list, and it would then skip the check altogether, as the algorithm already knows that it’s a safe piece of content.
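A white list check can be sketched in a few lines of Python. The lists and the name is_flagged are hypothetical, and the placeholder 'heck' again stands in for a real black-listed word.

```python
BLACK_LIST = {"heck"}               # placeholder black-listed word (assumption)
WHITE_LIST = {"check", "checkers"}  # known-safe words that contain it (assumption)


def is_flagged(word: str) -> bool:
    """Return True if the word should be flagged as profanity."""
    lowered = word.lower()
    # White-listed words skip the profanity check altogether.
    if lowered in WHITE_LIST:
        return False
    return any(bad in lowered for bad in BLACK_LIST)
```

Note that a safe word missing from the list, such as 'checkmate', is still flagged, so the white list has to be maintained as new false positives turn up.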

Content Normalisation

When designing and writing an algorithm, performance is always a key factor to take into consideration. We want to accomplish what we set out to achieve whilst running fast enough not to cause bottlenecks in our projects. This is where normalisation comes in. We want to take the input text and cut it down as much as we can, so that there are fewer words and characters to check and the data is processed more quickly. This is also a great time to run a white list check and remove any complete words that are found, taking them out of the check process altogether.

For profanity detection, there are a number of basic normalisations that can really cut down the amount of content we are checking, whilst also making it easier for the algorithm to detect unsuitable content. These can include stripping special characters (except in special circumstances such as hyphenated words), checking for numbers within words that may be used to replace letters (for example, ‘7’ in place of ‘T’) or simply removing whitespace.
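The basic normalisations listed above could look something like the following Python sketch. The substitution table and the function name normalise are assumptions for illustration; a real table would need tuning per language.

```python
import re

# Hypothetical table of characters commonly used in place of letters.
SWAP_TABLE = str.maketrans({
    "7": "t", "3": "e", "0": "o", "1": "i",
    "4": "a", "5": "s", "@": "a", "$": "s",
})


def normalise(text: str) -> str:
    """Lower-case, undo common character swaps, strip symbols, tidy whitespace."""
    swapped = text.lower().translate(SWAP_TABLE)
    # Drop special characters but keep letters, whitespace and hyphens,
    # so that hyphenated words survive intact.
    stripped = re.sub(r"[^\w\s-]|_", "", swapped)
    # Collapse runs of whitespace into single spaces.
    return " ".join(stripped.split())
```

For example, normalise("H3ck!!   y0u") gives "heck you". The swap step is lossy (a genuine '7' in the text also becomes 't'), so the result is only suitable for checking, not for displaying back to the user.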

You can also go further than this and start to add language specific normalisation into the mix. If you know what language the content is in and how a language is supposed to be formed, you can add in a set of rules that strips words down to their bare bones. This process is known as stemming and is widely documented, with various papers on how to approach it for different languages. The tricky aspect to this is detecting the language some content is in, especially if multiple languages can be used within the same sentence, so it isn’t a perfect solution.
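To give a feel for what stemming does, here is a deliberately naive suffix-stripping sketch in Python for English. The suffix list and the minimum stem length are assumptions chosen for the example; published stemmers such as the Porter algorithm apply far more rules than this.

```python
# Longest suffixes first, so "ers" is tried before "er" and "s".
SUFFIXES = ("ing", "ers", "er", "ed", "es", "s")
MIN_STEM_LENGTH = 3  # assumed guard so short words are left alone


def naive_stem(word: str) -> str:
    """Strip one common English suffix, if enough of the word remains."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= MIN_STEM_LENGTH:
            return word[: -len(suffix)]
    return word
```

With this in place, 'hecking', 'hecked' and 'hecks' all reduce to the same stem, so only one entry is needed to cover them.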

One other method of normalisation worth mentioning is the use of stop words. In English, words such as ‘is’, ‘the’, ‘and’, ‘a’ and ‘her’ are common words that hold sentences together. We can compose a stop word list and check content against it before sending that content to be checked for profanity, much like the white list. This can cut down the number of words to check significantly.
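A stop word pass is simple to sketch in Python. The list below only contains the five example words from the paragraph above, and the function name is hypothetical.

```python
STOP_WORDS = {"is", "the", "and", "a", "her"}  # example words from the text


def remove_stop_words(words: list[str]) -> list[str]:
    """Drop stop words so that fewer words reach the profanity check."""
    return [word for word in words if word.lower() not in STOP_WORDS]
```

Running it on "the cat and her kitten is a menace".split() leaves three of the eight words to check: 'cat', 'kitten' and 'menace'.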

Common Approaches

Like most common problems, there are numerous solutions out there that have already been researched, formulated and made readily available, either for free or as paid software.

The most straightforward way of tackling this problem is to use simple string comparisons to check whether a word matches a list of predefined words that should be flagged. Most profanity detection solutions do this in some form, either on full strings or on a per-character basis. Used as the only method of detection, however, string comparison leads to lots of flaws in the algorithm, and it runs far slower than it needs to because it lacks the normalisation discussed above.

This is where the Aho-Corasick algorithm comes in. The idea behind it is to build a search tree, known as a Trie, from a black list of words ahead of time, so that shared prefixes are stored only once, and then to traverse it efficiently when checking content.

This method can use a fair amount of memory if the Trie becomes very large, so that is something to consider, but it speeds up content processing significantly and lends itself nicely to normalisation techniques and other optimisation methods such as stop word lists. The other great thing about storing content this way is that it doesn’t matter what language you are checking: the algorithm checks each element of the content from the root of the tree, and if it finds an entire match it returns a positive result.
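For readers who haven’t met the algorithm before, here is a compact Python sketch of a textbook Aho-Corasick matcher: a Trie plus failure links, which let the search fall back to the longest matching suffix instead of restarting from the root. It is a generic illustration, not Hutch’s implementation, and the class and method names are assumptions.

```python
from collections import deque


class AhoCorasick:
    """Trie with failure links, built once from a word list."""

    def __init__(self, words):
        # Node 0 is the root. Each node has a child map, a failure link
        # and the list of words that end at (or are suffixes of) that node.
        self.children = [{}]
        self.fail = [0]
        self.output = [[]]
        for word in words:
            self._insert(word)
        self._build_failure_links()

    def _insert(self, word):
        node = 0
        for ch in word:
            nxt = self.children[node].get(ch)
            if nxt is None:
                nxt = len(self.children)
                self.children[node][ch] = nxt
                self.children.append({})
                self.fail.append(0)
                self.output.append([])
            node = nxt
        self.output[node].append(word)

    def _build_failure_links(self):
        # Breadth-first, so shallower nodes are finished before deeper ones.
        queue = deque(self.children[0].values())  # depth-1 nodes fail to root
        while queue:
            node = queue.popleft()
            for ch, child in self.children[node].items():
                fallback = self.fail[node]
                while fallback and ch not in self.children[fallback]:
                    fallback = self.fail[fallback]
                self.fail[child] = self.children[fallback].get(ch, 0)
                # Inherit matches that end at the failure target.
                self.output[child] = self.output[child] + self.output[self.fail[child]]
                queue.append(child)

    def search(self, text):
        """Return (start_index, word) for every match, in one pass over text."""
        matches = []
        node = 0
        for index, ch in enumerate(text):
            while node and ch not in self.children[node]:
                node = self.fail[node]
            node = self.children[node].get(ch, 0)
            for word in self.output[node]:
                matches.append((index - len(word) + 1, word))
        return matches
```

Using the classic example, AhoCorasick(["he", "she", "his", "hers"]).search("ushers") returns [(1, "she"), (2, "he"), (2, "hers")]. Every character of the input is visited once, regardless of how many words are in the list, which is where the speed-up over repeated string comparisons comes from.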

It’s also worth mentioning that there are some solutions out there that are readily available at a cost. These include:

Microsoft’s Azure Cognitive Services Content Moderator, offering a wide range of content moderation tools as well as an SDK to tailor a system to your specific needs.
WebPurify (https://www.webpurify.com/), which is widely used on Google and Amazon products and has a similar content moderation offering to the Microsoft option.

Both of these options will suffice for most people's needs, but with the big caveat that they can become very costly if you are expecting large volumes of traffic. It is probably not advisable to rely solely on one of these systems; instead, use them as a fallback to an in-house solution.

The Aho-Corasick Hybrid Solution

With all of the above in mind, we settled on a hybrid solution using the Aho-Corasick algorithm with a third party solution as a fallback.

From a structural perspective, we have a master API which wraps the internal workings of the algorithm and acts as the point of access for a client. This API tracks three separate search Tries for each supported language: a black list, a white list and a stop list. Each of these is constructed up front and taken into account internally when content is scanned for profanity, applying whatever normalisation we want to take effect.
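The shape of such an API might look roughly like the Python sketch below. Everything here is hypothetical: the class and method names are invented for the example, and plain sets stand in for the three Tries to keep it short.

```python
from dataclasses import dataclass, field


@dataclass
class LanguageLists:
    """The three lists tracked per language (sets stand in for Tries here)."""
    black_list: set = field(default_factory=set)
    white_list: set = field(default_factory=set)
    stop_list: set = field(default_factory=set)


class ProfanityApi:
    """Single point of access that hides the per-language lists."""

    def __init__(self):
        self._languages = {}

    def register_language(self, code: str, lists: LanguageLists) -> None:
        # Lists are supplied up front, before any content is scanned.
        self._languages[code] = lists

    def find_profanity(self, text: str, language: str) -> list:
        lists = self._languages[language]
        flagged = []
        for word in text.lower().split():
            # Stop words and white-listed words never reach the black list.
            if word in lists.stop_list or word in lists.white_list:
                continue
            if any(bad in word for bad in lists.black_list):
                flagged.append(word)
        return flagged
```

A caller registers each language once, for instance with LanguageLists({"heck"}, {"check"}, {"the", "a"}) for "en", and then only ever calls find_profanity.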

On the server, we use an MVC controller to access this API; the controller takes any results the algorithm returns and decides what to do with them. This is where the hybrid part comes in: we have added support for Microsoft’s Azure Cognitive Services Content Moderator as a fallback when certain conditions are met. If the in-house solution detects no profanity, we check a profanity score that we track internally for each user, which tells us how far we can trust them not to use profanity. If the user exceeds the set threshold, we send the content to the Azure Content Moderator for a second opinion. If that returns flagged content that our internal system missed, we add the detected content to our internal database so that the in-house solution can catch it next time without relying on the external service. This gives us the benefit of a fully fledged solution with ongoing support, whilst reducing the cost of using it by relying mostly on our in-house solution to do a good enough job.
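The decision flow described above can be sketched in Python as follows. The threshold value, the function name moderate and both checker callables are assumptions; in particular, external_check is a stand-in for whatever client wraps the third-party service and does not represent the Azure SDK.

```python
from typing import Callable, List, Set

SCORE_THRESHOLD = 3  # hypothetical value; the real threshold is not stated


def moderate(
    text: str,
    user_score: int,
    local_check: Callable[[str], List[str]],
    external_check: Callable[[str], List[str]],
    learned_terms: Set[str],
) -> List[str]:
    """Run the in-house check first and fall back only for low-trust users."""
    flagged = local_check(text)
    if flagged:
        return flagged
    # Nothing found locally: users at or below the threshold are trusted.
    if user_score <= SCORE_THRESHOLD:
        return []
    # Low-trust user: ask the external service for a second opinion.
    external = external_check(text)
    # Remember anything new so the in-house check catches it next time.
    learned_terms.update(external)
    return external
```

The external service is only paid for when the local check finds nothing and the user’s score is over the threshold, and each term it reports is fed back so the same content should not need a second external call.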

The final component was to allow clients to call the algorithm directly, so that they can check for profanity before ever sending a request to the server. The idea is that the client uses a stripped-back version of the system to save on memory: in our case, the user’s local language plus English (if English isn’t their main language). In theory, this catches most cases of profanity, whilst improving performance and reducing server costs.
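Choosing which languages the client loads is a small piece of logic, sketched here in Python. The function name and the use of short language codes are assumptions for the example.

```python
FALLBACK_LANGUAGE = "en"  # English is always included, as described above


def client_languages(user_language: str) -> list:
    """Return the languages a client should load, without duplicates."""
    # dict.fromkeys keeps insertion order and drops the repeat
    # when the user's language is already English.
    return list(dict.fromkeys([user_language, FALLBACK_LANGUAGE]))
```

A Spanish-language client would load ["es", "en"], while an English-language client loads only ["en"].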

Future Improvement Considerations

Despite having this solution in use by real world applications, there are still many ways that it could be improved, both functionally and from a performance point of view.

Firstly, it is memory intensive, especially when the Trie structures are built from large content lists, so looking into ways to reduce memory usage would benefit the system as a whole. Loading pre-constructed Trie data blocks, rather than constructing the Tries on the fly when our application boots, would reduce the application’s loading time and would likely reduce memory usage and garbage collection as well. This would benefit the client’s direct usage of the algorithm more than anything, as clients are likely to have far fewer resources available.
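One way to picture a pre-constructed Trie is to build it once at publish time, serialise it, and have the application load the serialised form at boot. The Python sketch below uses JSON purely for illustration; the format, the function names and the end-of-word marker are all assumptions, and whether this actually saves time or memory would need measuring on the target platform.

```python
import json

END = ""  # end-of-word marker; an empty key can never clash with a character


def build_trie(words):
    """Build a nested-dict Trie (done offline, not at application boot)."""
    root = {}
    for word in words:
        node = root
        for ch in word:
            node = node.setdefault(ch, {})
        node[END] = True
    return root


def dump_trie(trie) -> str:
    """Serialise the Trie compactly so it can be shipped as a data file."""
    return json.dumps(trie, separators=(",", ":"))


def load_trie(blob: str) -> dict:
    """Load a previously built Trie without re-inserting every word."""
    return json.loads(blob)


def contains(trie, word) -> bool:
    node = trie
    for ch in word:
        node = node.get(ch)
        if node is None:
            return False
    return END in node
```

The loaded structure is identical to the one that was built, so lookups behave the same either way.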

Given the time, adding stemming to the normalisation step for English, and potentially other languages, would reduce the number of nodes created in the Trie and therefore the memory used. It would also improve performance, because each piece of content shortened by stemming has fewer characters to check.

Finally, adding a level of automation for character swap detection, such as numbers replacing letters, and flagging these as profanity without needing to store every variant in the black list would make the system more flexible and smarter overall. Having a black list is useful, but it would be far nicer if the system could figure these cases out for us automatically.
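One possible shape for this, sketched in Python, is to compare input against a black-listed word while allowing each character to stand for any of the letters it commonly replaces. The swap table and the function name are assumptions, and the placeholder 'heck' stands in for a real black-listed word.

```python
# Hypothetical table: each symbol maps to the letters it may stand in for.
SWAPS = {"1": "il", "7": "t", "3": "e", "0": "o", "$": "s", "@": "a"}


def matches_with_swaps(candidate: str, word: str) -> bool:
    """True if candidate equals word once character swaps are allowed for."""
    if len(candidate) != len(word):
        return False
    return all(
        c == w or w in SWAPS.get(c, "")
        for c, w in zip(candidate.lower(), word)
    )
```

With this, matches_with_swaps("h3ck", "heck") is true even though only 'heck' is stored, and a symbol such as '1' can match either 'i' or 'l' depending on the word being compared.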

If you're a Server Engineer or a Unity Games Engineer looking to start a new chapter in your career, check out our current vacancies here: hutch.io/careers. We'd love to hear from you!
