In this blog entry we will look at how to use Providence and its new link analysis feature to identify bot-nets and other coordinated activity.
The subject matter for this test case will be the US political landscape and attempts, both foreign and domestic, to influence voters using coordinated messaging.
While we are using this to explore threats to national political landscapes, it can just as easily be set up to identify coordinated behavior in cyber security, such as the use of Twitter botnets to provide Command and Control (C&C) for malware.
It should be noted before we proceed that this blog and test case were not produced from pre-defined or canned data. The work was done on a live data feed, with information collected over a short period of two days. The blog was written as I worked through the steps, with no prior knowledge of what the data contained or would reveal.
We have set up an Entity in Providence and called it US Political Interference. Collection has been running for two days, and we have a total of 159,974 posts collected to date, as can be seen in Figure 1 below.
From the main management screen of Providence, we will jump into the link analysis feature and begin exploring these 159,974 posts related to the US political landscape.
We will set up the Graph workspace (where the link chart is built and worked with) to look at the following attributes of the associated posts:
- Keyword – the word that triggered the post to be collected;
- Username – the person who made the post;
- Post Date – the time and date when the post was made.
The methodology we will use is intended to identify suspicious coordinated activity by linking Usernames to posts made at the exact same time as one another.
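Outside of the link chart, the same idea can be sketched in a few lines of Python. This is only an illustration of the methodology, not how Providence implements it: the `posts` records, the function names and the placeholder values are all hypothetical, and the sketch assumes each post has been reduced to a (keyword, username, post date) tuple.

```python
from collections import defaultdict

# Hypothetical sample records: (keyword, username, post_date).
# The values are placeholders, not data from the collection described here.
posts = [
    ("keyword_1", "user_a", "2000-01-01 12:00:00"),
    ("keyword_1", "user_b", "2000-01-01 12:00:00"),
    ("keyword_2", "user_c", "2000-01-01 12:05:31"),
]


def usernames_by_timestamp(posts):
    """Map each post timestamp to the set of usernames that posted at it."""
    groups = defaultdict(set)
    for _keyword, username, post_date in posts:
        groups[post_date].add(username)
    return groups


def collisions(posts, min_accounts=2):
    """Keep only timestamps shared by at least min_accounts distinct usernames."""
    return {
        ts: users
        for ts, users in usernames_by_timestamp(posts).items()
        if len(users) >= min_accounts
    }


if __name__ == "__main__":
    for ts, users in collisions(posts).items():
        print(ts, sorted(users))
```

A timestamp that appears in the result is only a starting point for investigation. A single shared timestamp can be coincidence.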
Initially we load the Keywords on their own and then start associating Usernames with the Keywords, building sub-graphs, which are essentially sub-link charts of the same investigation. We then select the Keyword at the centre of each link chart group and delete it. This explodes all the Usernames into separate “people” in the chart, as seen below in Figure 2.
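To make the "explode" step concrete, here is a rough sketch of what deleting the central Keyword does to a simple graph. The adjacency-set representation and the `build_graph` and `remove_node` helpers are assumptions made for illustration. They say nothing about how Providence stores its link charts.

```python
from collections import defaultdict


def build_graph(links):
    """Build an undirected graph as a dict of node -> set of neighbours."""
    graph = defaultdict(set)
    for a, b in links:
        graph[a].add(b)
        graph[b].add(a)
    return graph


def remove_node(graph, node):
    """Delete a node and every link that touches it."""
    for neighbour in graph.pop(node, set()):
        graph[neighbour].discard(node)


if __name__ == "__main__":
    # Hypothetical keyword hub linked to three usernames.
    graph = build_graph([
        ("keyword_1", "user_a"),
        ("keyword_1", "user_b"),
        ("keyword_1", "user_c"),
    ])
    remove_node(graph, "keyword_1")
    # Each username is now an isolated node with no neighbours,
    # ready to be linked to Post Dates instead.
    print(dict(graph))
```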
Once we have exploded the Usernames out, the next step is to add the Post Dates. We will do this a little at a time, both to watch for anomalies and to keep the work area tidy. When working with link charts, adding too many entities at once can quickly overcrowd the chart and make the relationships it contains very hard to understand.
You can see in Figure 3 below that groupings of Usernames and time and date stamps from social media posts have started to form.
As mentioned above, groupings within the chart have started to form, such as the one shown in Figure 4 below. However, this is not anomalous behavior and is very common, particularly with media outlets: a single entity posting content, with no connections to other social media profiles.
As we continue to add times and dates in association with Usernames, we can see many more single users posting content. However, we can now also see an anomalous grouping. In Figure 5 below you can see the overall link chart and its normal groupings, with the anomalous grouping broken out.
While the above grouping is anomalous, it does not necessarily mean it is a coordinated effort. It does happen on occasion that two users post content at the exact same time. Further investigation is needed to determine what was posted and whether the accounts are related in any way.
What we are looking for is the type of grouping where multiple accounts are connected and also collide on multiple times and dates. This would give much more confidence that what you are looking at is a coordinated effort.
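One way to express "colliding on multiple times and dates" in code is to count, for every pair of accounts, how many distinct timestamps they share. The sketch below assumes posts reduced to (username, post date) tuples. The function names and the `min_shared` threshold of 2 are illustrative choices, not values taken from Providence.

```python
from collections import Counter, defaultdict
from itertools import combinations


def shared_timestamp_counts(posts):
    """Count, for each pair of usernames, the timestamps at which both posted."""
    by_ts = defaultdict(set)
    for username, post_date in posts:
        by_ts[post_date].add(username)

    pair_counts = Counter()
    for users in by_ts.values():
        # Sort so each pair is always stored in the same order.
        for pair in combinations(sorted(users), 2):
            pair_counts[pair] += 1
    return pair_counts


def repeated_collisions(posts, min_shared=2):
    """Return pairs of usernames that share at least min_shared timestamps."""
    return {
        pair: n
        for pair, n in shared_timestamp_counts(posts).items()
        if n >= min_shared
    }


if __name__ == "__main__":
    # Hypothetical records; "t1", "t2" and "t3" stand in for real timestamps.
    posts = [
        ("acct_1", "t1"), ("acct_2", "t1"),
        ("acct_1", "t2"), ("acct_2", "t2"), ("acct_3", "t2"),
        ("acct_4", "t3"),
    ]
    print(repeated_collisions(posts))
```

Pairs that survive this filter have collided more than once, which is harder to explain as coincidence than a single shared timestamp.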
The post at the time and date stamp linking the two Usernames together was anti-Obama propaganda.
One of the accounts has already been suspended by Twitter, potentially because it posted content in breach of user policies or because it was identified as a bot by Twitter's internal algorithms. The other user account appears, at face value, to be an activist account.
The next step I took was to select the date and time that links the two original social media accounts and add any other usernames (Twitter accounts) that are associated with that particular time/date/post.
As can be seen in Figure 7 above, there are sixteen additional social media profiles that collide with the time and date at which this post was made. This is an anomaly and looks very different from any other grouping in the link chart.
The majority of the associated accounts posted anti-Obama, pro-Trump propaganda at exactly the same time. All the accounts that are still active look to be political activist accounts supportive of Donald Trump’s administration, with profile descriptions such as:
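The "looks very different from any other grouping" judgement was made by eye in the link chart, but a rough numerical equivalent is to compare each timestamp's account count against the typical group size. The sketch below uses the median as "typical" and an arbitrary `factor` of 5. Both are assumptions for illustration, and the sample data is synthetic.

```python
from collections import defaultdict
from statistics import median


def group_sizes(posts):
    """Return the number of distinct usernames seen at each timestamp."""
    by_ts = defaultdict(set)
    for username, post_date in posts:
        by_ts[post_date].add(username)
    return {ts: len(users) for ts, users in by_ts.items()}


def unusually_large_groups(posts, factor=5):
    """Flag timestamps whose account count is at least factor times the median."""
    sizes = group_sizes(posts)
    if not sizes:
        return {}
    typical = median(sizes.values())
    return {ts: n for ts, n in sizes.items() if n >= factor * typical}


if __name__ == "__main__":
    # Synthetic example: eighteen accounts sharing one timestamp (two
    # original accounts plus sixteen more), alongside ten single posters.
    posts = [(f"acct_{i}", "t_burst") for i in range(18)]
    posts += [(f"solo_{i}", f"t_{i}") for i in range(10)]
    print(unusually_large_groups(posts))
```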
“Born again patriot since we the people put a real leader in the Whitehouse.” and “Patriot thru and thru. Anti-establishment purist.”
Further investigation and analysis would need to be done to determine if each of these accounts was posting in a coordinated fashion with the intent to influence public opinion in relation to US politics. That part of the investigation is beyond the scope of this blog. The intent was to demonstrate a very quick way to collect and analyse large volumes of data to identify potential coordinated efforts to sway public opinion.
However, I will outline some of the main indicators to look for when trying to identify a bot account:
- Personalised profile data – Bot accounts, particularly when a large number are being run in coordination, may have limited personal information in the “About” parts of their profile, and may have no profile image
- Personalised content – Be aware of profiles that never post personalised content and just re-Tweet others' content. This is known as amplification: one main role of bots is to boost the signal from other users by retweeting, liking or quoting them. If an account did in fact belong to a real person, one would expect to see some personalised content
- Content type – If an account just re-Tweets a wide variety of content, such as widely different businesses and products, this could indicate a bot-for-hire account
- Account activity – A number of studies have looked into what bot activity looks like. Suggested thresholds range from possible bot activity at 50 to 70 posts per day to high confidence when the post count is around 170 per day
- Creation dates – Look at the dates the social media profiles were created. If they were all created on or around the same date, this could be an indication of bot accounts or fake accounts being run
- Connections – Look at the profile's connections (other platform users). Do they all look the same? For example, do they all have a level of anonymity, such as no profile pictures?
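Two of these indicators, account activity and creation dates, are easy to check mechanically. The sketch below is a simplified illustration under stated assumptions. The cut-offs of 50 and 170 posts per day come from the figures quoted above, but treating everything from 50 up to 170 as "possible" is my own simplification, and the creation-date check only matches identical dates rather than dates "around" each other. All function and variable names are hypothetical.

```python
from collections import defaultdict
from datetime import date

# Cut-offs taken from the posts-per-day figures quoted in the list above.
POSSIBLE_POSTS_PER_DAY = 50
HIGH_CONFIDENCE_POSTS_PER_DAY = 170


def activity_signal(post_count, days_observed):
    """Classify an account's posting rate as 'none', 'possible' or 'high'."""
    if days_observed <= 0:
        raise ValueError("days_observed must be positive")
    rate = post_count / days_observed
    if rate >= HIGH_CONFIDENCE_POSTS_PER_DAY:
        return "high"
    if rate >= POSSIBLE_POSTS_PER_DAY:
        return "possible"
    return "none"


def shared_creation_dates(created, min_accounts=3):
    """Return creation dates shared by at least min_accounts usernames.

    created maps username -> datetime.date on which the account was created.
    """
    by_date = defaultdict(set)
    for username, created_on in created.items():
        by_date[created_on].add(username)
    return {d: users for d, users in by_date.items() if len(users) >= min_accounts}


if __name__ == "__main__":
    # Placeholder values only, not taken from the accounts discussed above.
    print(activity_signal(post_count=120, days_observed=2))
    created = {
        "acct_1": date(2000, 1, 1),
        "acct_2": date(2000, 1, 1),
        "acct_3": date(2000, 1, 1),
        "acct_4": date(2000, 6, 15),
    }
    print(shared_creation_dates(created))
```

Neither check is conclusive on its own. They are best used to prioritise which of the linked accounts to review by hand.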
Using the technique described above in Providence will allow a user to quickly identify potential bot-nets or fraudulent social media accounts working in unison. This, coupled with the attributes of fake accounts listed in the points above, will give you further confidence as to whether linked profiles are a coincidence or in fact something more nefarious.