Graph technology could be the answer to tackling the growing problem of fraud, writes Emil Eifrem of Neo Technology.
UK businesses lose just under a hundred billion pounds every year to fraud, according to a report by the University of Portsmouth’s Centre for Counter Fraud Studies and accountancy firm PKF Littlejohn (http://www.pkf-littlejohn.com/the-financial-cost-of-fraud-2015.php).
Think about it: that’s the cost of Trident, per year, draining out of businesses. Is there anything that can be done about that waste and criminality? While no procedures are 100 per cent foolproof, the same study found that fraud can be reduced by up to 40 per cent if the right security measures are put in place. One of the key ways of doing this is to look beyond the individual data points to the connections between them – joining the dots to uncover any suspicious patterns amongst a deluge of uninteresting transactional data.
Unfortunately these intricate pictures too often go under the radar. But if we could spot this malicious behaviour early enough, that would be a huge aid in terms of preventing or rapidly shutting down fraud. Doing that, though, is never a simple case of dot-to-dot. Making connections between the ‘dots,’ i.e. the data, is hard. The good news is that, thanks to technology, meaningful insight can be mined from these complex datasets – like customer banking or online credit card transactions – detailing new connections and hidden patterns to build up a picture previously invisible.
The technology that delivers this vision is the graph database. Unlike other ways of managing data, graph databases were developed to express the relationships between data, which lets them uncover patterns that are notoriously difficult to identify using relational databases and SQL tables. As a result, a growing number of enterprises, from banks and financial institutions to online retailers, are adopting graph databases to solve a variety of data problems. Last year, business analyst group Forrester Research forecast that just over a quarter of enterprises will be using graph databases by 2017. Indeed, some of the biggest consumer and e-commerce sites have used graph databases to draw information from online data relationships. But whereas industry heavyweights such as Google and LinkedIn had to develop their graph database technology in-house, off-the-shelf graph databases are now available for businesses looking to explore data connections.
This is especially true in the fight against criminal activity, and fraud in particular.
There are various types of online fraud – banking, insurance, e-commerce, for example. But what they all have in common is layers of deceit to cover up the crime that can only be discovered through deep analysis. In each of these types of fraud, graph databases can assist existing methods of detection, making exposure of crime more efficient and cost-effective. First-party fraud, whether by individuals or organised crime syndicates, is difficult to detect via traditional application screening and account management processes as they have not been developed to look at the right patterns – in this case, shared identifiers. This is where graph databases show their power.
Exposing fraud rings with traditional relational database technologies requires modelling the data as a set of tables and columns, then carrying out a series of complex joins and self-joins. The problem is that these queries are complicated to build and expensive to run. Scaling them to support real-time access to big volumes of data gives rise to major technical challenges, with the chances of success degrading as the size of the criminal ring increases and the total data set grows.
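To make the join problem concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table name, column names and sample data are all hypothetical, invented purely for illustration. It shows that one self-join finds account holders who directly share an identifier, while finding people linked through an intermediary already needs three joins.

```python
import sqlite3

# Hypothetical schema: one row per (account holder, identifier) pair.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE holder_identifier (holder TEXT, kind TEXT, value TEXT)")
conn.executemany(
    "INSERT INTO holder_identifier VALUES (?, ?, ?)",
    [
        ("alice", "phone", "555-0101"),
        ("bob", "phone", "555-0101"),
        ("bob", "address", "12 High St"),
        ("carol", "address", "12 High St"),
        ("dave", "phone", "555-0199"),
    ],
)

# One self-join: holders who directly share an identifier.
direct_pairs = conn.execute(
    """
    SELECT DISTINCT a.holder, b.holder
    FROM holder_identifier AS a
    JOIN holder_identifier AS b
      ON a.kind = b.kind AND a.value = b.value AND a.holder < b.holder
    ORDER BY a.holder, b.holder
    """
).fetchall()

# Three joins: holders linked only through one intermediary.
two_hop_pairs = conn.execute(
    """
    SELECT DISTINCT a.holder, d.holder
    FROM holder_identifier AS a
    JOIN holder_identifier AS b
      ON a.kind = b.kind AND a.value = b.value AND a.holder <> b.holder
    JOIN holder_identifier AS c
      ON c.holder = b.holder
    JOIN holder_identifier AS d
      ON c.kind = d.kind AND c.value = d.value AND d.holder <> c.holder
    WHERE a.holder < d.holder
    ORDER BY a.holder, d.holder
    """
).fetchall()

print(direct_pairs)
print(two_hop_pairs)
```

Each extra link in the chain adds further joins to the query, which is why larger rings become progressively harder to expose this way.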
Graph databases fare much better here. Query languages like Cypher provide simple and effective semantics for detecting fraud rings, navigating connections in memory and in real-time. Existing fraud detection infrastructure can be extended to support ring detection by running entity link analysis queries in a graph database. This is reinforced by running checks at crucial stages in the customer and account lifecycle: when an account is created, as soon as a credit balance threshold is hit, and when a cheque bounces. Real-time graph traversals linked to these types of events can help banks identify probable fraud rings during or even before a bust-out (when the gang completes its fraud and disappears). The faster a bank can spot a potential fraud, the faster it can stop it. But the time margins for detecting fraud are getting narrower, calling for real-time solutions, and graphs are central to delivering them.
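The idea behind entity link analysis can be sketched in a few lines of plain Python. This is not Cypher and not how any particular graph database implements it; it is an illustration, under the assumption that each record pairs an account holder with one identifier, of how traversing shared identifiers groups holders into candidate rings. The function name, the `min_size` parameter and the sample data are hypothetical.

```python
from collections import defaultdict


def find_rings(records, min_size=2):
    """Group holders who are linked, directly or indirectly, by shared identifiers."""
    holders_by_identifier = defaultdict(set)
    for holder, identifier in records:
        holders_by_identifier[identifier].add(holder)

    # Two holders are neighbours if they share at least one identifier.
    neighbours = defaultdict(set)
    for holders in holders_by_identifier.values():
        for holder in holders:
            neighbours[holder] |= holders - {holder}

    # Depth-first traversal collects each connected group of holders.
    seen = set()
    rings = []
    for start in sorted(neighbours):
        if start in seen:
            continue
        component, stack = set(), [start]
        while stack:
            holder = stack.pop()
            if holder in component:
                continue
            component.add(holder)
            stack.extend(neighbours[holder] - component)
        seen |= component
        if len(component) >= min_size:
            rings.append(sorted(component))
    return rings


# Hypothetical sample data: (holder, identifier) pairs.
records = [
    ("alice", "phone:555-0101"),
    ("bob", "phone:555-0101"),
    ("bob", "address:12 High St"),
    ("carol", "address:12 High St"),
    ("dave", "phone:555-0199"),
]
rings = find_rings(records)
print(rings)
```

A check like this could be triggered at the lifecycle events described above, for example when a new account is opened, so that a suspicious cluster is reviewed before credit is extended.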
Another major benefit: connected analysis. Hackers attack when they spot a security gap. Traditional technologies do offer a level of protection, but are not designed to detect increasingly skilful and complex fraud operations. Graph databases, because of their ability to manage connected data, can break open a large number of key fraud patterns in real-time, whether the fraud is carried out by groups or by individuals. There are many types of fraud where graphs have shown their power. In the UK, for example, insurance fraud attracts highly professional criminal rings, who have proven very adept at escaping fraud detection measures. These rings may stage ‘accidents’, complete with fake passengers and fake witnesses, to defraud the insurance company. Graph databases are a powerful tool in stopping them, as they can spot the links between perpetrators.
The next step in insurance fraud detection is shaping up to be smarter use of social media – that’s to say, network analysis to seek out potentially fraudulent activity. Connected analysis can reveal relationships between people who are otherwise acting like strangers, for example. Previously such operations have been complex and expensive, particularly for large data sets: graph databases’ ability to draw relationship connections, again, provides an exciting new tool here.
Finally, we have e-commerce fraud carried out by lone wolves. Online transactions are usually linked to identifiers like ID, IP address, geo-location, a tracking cookie, and a credit card number. Relationships between these identifiers should be one-to-one, although some variations may be due to shared devices, for example. However, as soon as the relationships begin to exceed a reasonable number, fraud is usually responsible. As with first-party bank fraud and insurance fraud, graph databases can build up a pattern of discovery in real-time.
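As a rough illustration of that threshold check, the sketch below counts how many distinct users are attached to each identifier and flags any that exceed a limit. The field names (`user_id`, `ip`, `cookie`, `card`), the `max_users` threshold and the sample transactions are all assumptions made for the example; a sensible threshold would depend on the business.

```python
from collections import defaultdict


def flag_shared_identifiers(transactions, max_users=2):
    """Return identifiers linked to more than max_users distinct users."""
    users_by_identifier = defaultdict(set)
    for tx in transactions:
        for kind in ("ip", "cookie", "card"):
            users_by_identifier[(kind, tx[kind])].add(tx["user_id"])
    return {
        identifier: sorted(users)
        for identifier, users in users_by_identifier.items()
        if len(users) > max_users
    }


# Hypothetical sample data: three users paying with the same card.
transactions = [
    {"user_id": "u1", "ip": "10.0.0.1", "cookie": "c1", "card": "card-A"},
    {"user_id": "u2", "ip": "10.0.0.1", "cookie": "c2", "card": "card-A"},
    {"user_id": "u3", "ip": "10.0.0.3", "cookie": "c3", "card": "card-A"},
    {"user_id": "u4", "ip": "10.0.0.4", "cookie": "c4", "card": "card-B"},
]
flagged = flag_shared_identifiers(transactions)
print(flagged)
```

Here the IP address shared by two users stays under the limit, which allows for shared devices, while the card used by three different users is flagged for review.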
The verdict has to be that UK businesses are struggling with identifying and preventing fraud. Graph databases are a way to query the intricate connected networks that underpin a lot of fraud – and help you limit the damage of that £100 billion hidden drag on UK plc’s performance.
The author is co-founder and CEO of Neo Technology, the company behind Neo4j, a graph database (http://neo4j.com/)