Machine learning reveals controversial search ranking factors in credit cards sector

SEO and machine learning are increasingly critical to how businesses win visibility online, and nowhere more so than in competitive fintech verticals such as credit cards. More and more marketing decisions are being driven by the technology that determines which products get seen. The study below highlights some controversial Google search ranking factors for the credit card industry.

Summary: Artios, a London-based SEO agency that uses machine learning to determine sector-specific ranking factors, analysed the top 100 search results for 100 search phrases picked at random from the credit card industry. Their data modelling team analysed more than 170 potential ranking factors.

Key findings:

  • The volume of inbound NoFollow links is by far the strongest predictor of search visibility, potentially debunking the myth that NoFollow links don’t matter. Starting from a low base, a single NoFollow link can improve search rank by 8 places. At higher positions, each doubling of the number of NoFollow links improves search rank by 2 positions.

  • DoFollow links may not be the silver bullet we thought they were. Of the 170 ranking factors analysed, DoFollow links were 38th in order of importance. All of the following are more influential on search performance than DoFollow links:

    • Volume of NoFollow links

    • Server response time

    • Topical relevance. Conveying to search engines a strong sense of what the site is *about* (not, repeat not, keyword density)

  • Emotion matters, and Google likes scary copy. Using language that invokes fear increases rankings, with the highest rankings seen when fear is moderate. The effect is relatively small (only around half a position on average) but appears statistically significant.

  • Linking out is more important than previously thought, especially for the largest websites: 100,000 outbound links equate to a one-position ranking increase.

  • You’re only as good as your first paragraph. Readability really matters. The study findings suggest that if paragraph 1 falls short on readability, it can’t be mitigated with great copy further down the page.

  • Google lowers its expectations of your site depending on your current position, and rewards accordingly. Google treats sites that are already high ranking differently to those that rank low. Search performance is a different type of game in the lower tier.

NoFollow Links

The biggest predictor of rank is, by far, the number of NoFollow backlinks. For each doubling in the number of NoFollow backlinks, we see the average Google rank improve by 2 positions. This is interesting because it shows a logarithmic relationship: each additional position gained costs exponentially more links.

This is big news because the SEO community has argued for years about whether NoFollow backlinks count. We now have statistical evidence suggesting that NoFollow backlinks do predict improvements in rankings.

But this could be correlation rather than causation: sites with significant brand awareness naturally attract NoFollow links, and sites that invest in content that satisfies the user query also tend to run advertorial campaigns.

NoFollow links predict search rank only up to a point. There’s a theoretical ceiling on how much impact they have.

The table and plots show us that it becomes increasingly hard to increase Google Rank as you increase the number of NoFollow Backlinks.

• Going from having no NoFollow links to one Nofollow link increases the average Google Rank by around 8 places.

• However, going from 100 NoFollow Backlinks to 1000 only increases Google Rank by around the same amount.

  • The sites with the most NoFollow links tended to belong to big brands with well-known websites.

    • Interestingly, the URL with the highest number of backlinks ranks only 97th for its one keyword.
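The diminishing returns described above can be sketched as a simple curve. This is a minimal illustration, not the study’s actual model: the log2 shape is our assumption, fitted to the two figures the study quotes (roughly 8 places for the first link and 2 places per doubling thereafter).

```python
import math

def predicted_rank_gain(nofollow_links):
    """Illustrative log-shaped model of the NoFollow finding: the first
    link is worth ~8 positions, and each doubling after that ~2 more.
    This curve is an assumption, not the study's fitted model."""
    if nofollow_links < 1:
        return 0.0
    return 8 + 2 * math.log2(nofollow_links)

# Diminishing returns: each 10x jump in links adds roughly the same gain.
for links in (1, 10, 100, 1000):
    print(links, round(predicted_rank_gain(links), 1))
```

Note how moving from 100 to 1,000 links buys roughly the same improvement as the very first link did, which matches the ceiling effect described above.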


DoFollow Links

DoFollow backlinks, by contrast, are only the 38th most important Google ranking factor.

Andreas Voniatis, lead data scientist at Artios, believes this is good news: “As an industry we’ve long been fixated with earning quality DoFollow links. Most SEO practitioners have felt a pang of disappointment after securing a great link, only to discover the rel=NoFollow tag next to their domain.

“This part of the study suggests these worries are unfounded. Variety of link type (followed and not), referring domain and link quality - essentially an extremely natural link profile - are the key to improved ranking.”

Outbound links

For large sites, linking out matters

In fact, the modelled effect on rank position per additional outbound link is -0.00001471, which equates to approximately 100,000 links to move a ranking by one position. Of course, you’d need a lot of content to justify that amount of links.
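Taken literally, the coefficient implies roughly 68,000 links per position; the study rounds this to the order of 100,000. A quick sanity check on the arithmetic:

```python
# The study's per-link coefficient for outbound links (sign flipped so
# that a positive number means "positions gained").
effect_per_link = 0.00001471

links_needed_for_one_position = 1 / effect_per_link
print(round(links_needed_for_one_position))    # 67981

# 100,000 outbound links therefore moves the predicted rank by:
print(round(100_000 * effect_per_link, 2))     # 1.47 positions
```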

Andreas Voniatis: “This finding is mostly relevant to content heavy sites sitting at the top of their niche, comparison sites and resource sites for example. To justify this volume of outbound links, you’d need tens of thousands, if not hundreds of thousands, of pages on your site. However, smaller sites can emulate this practice by sticking to a rule of including one outbound link per page, minimum.

“I’d personally recommend including outbound links wherever there is a claim or assertion that needs sourcing or wherever there’s an opportunity to enlighten the user further. The fear of leaking traffic should be lower now we know Google rewards the practice of linking out.”


Emotion matters, as does ‘sense’

Using natural language processing (relying on NLP libraries available in the Python programming language), we found that sites with a high probability of evoking fear in their readers also have a high probability of ranking well. In our study, each word was turned into a vector and assigned probabilities for the emotions and sentiment it conveys.
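A minimal sketch of lexicon-based emotion scoring of this general kind. The FEAR_LEXICON below and its probabilities are purely hypothetical, not the wordlist or model Artios actually used:

```python
# Toy fear lexicon: word -> illustrative "fear probability".
# These entries and values are invented for demonstration only.
FEAR_LEXICON = {"risk": 0.6, "debt": 0.7, "penalty": 0.8, "warning": 0.5}

def fear_score(text):
    """Average fear probability over the words in a text."""
    words = text.lower().split()
    if not words:
        return 0.0
    return sum(FEAR_LEXICON.get(w, 0.0) for w in words) / len(words)

copy = "missed payments incur a penalty and increase your debt risk"
print(round(fear_score(copy), 3))
```

Real emotion models are far richer (they handle negation, context and word vectors), but the scoring principle is the same: the copy as a whole is assigned an emotion probability.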

It’s also extremely important to give a strong sense of what the site is about. This isn’t achieved through keyword volume.

The wider industry claim is that search engines are now less dependent on exact words, and our analysis bears this out. We found that pages with a high computed likelihood of ‘being about’ credit cards achieved higher rankings on average than pages that didn’t.

This isn't the same as keyword density (the percentage of copy words containing keywords), which is just a keyword count. Concept analysis looks at the relationships between words within the copy, and a score is computed to judge whether the body of text is about a topic. The same effect was also apparent in title tags.
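As a rough illustration of scoring ‘aboutness’ rather than counting keywords, here is a bag-of-words cosine similarity against a topic description. Real systems use richer representations (word vectors, embeddings); this simplified sketch only shows why comparing the whole text to a topic differs from keyword density:

```python
import math
from collections import Counter

def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

def aboutness(page_text, topic_text):
    return cosine(Counter(page_text.lower().split()),
                  Counter(topic_text.lower().split()))

# Topic description and example copy are invented for illustration.
topic = "credit card interest rates fees rewards balance transfer"
on_topic = "compare credit card rewards and balance transfer fees"
off_topic = "our favourite pasta recipes for summer dinner parties"

print(aboutness(on_topic, topic) > aboutness(off_topic, topic))  # True
```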

Andreas Voniatis: “This finding was perhaps the most interesting. As an industry we’ve known that semantics play a large part in search performance, but this is potentially the first piece of evidence to suggest Google uses emotion of language to judge a site’s usefulness.

“Google appears to be applying some degree of regulation by proxy on credit card sites. Industry regulators encourage businesses in the financial services sector to emphasise the risks of their products over the benefits. This typically results in a higher degree of fear emotion in the copy.”

Google’s lowered expectations

“SEO is a lot like elite sport. It’s a different game entirely when you’re in the lower leagues.” - Andreas Voniatis

Google treats high ranking (top 20 positions) differently to the bottom 80. The rules are the same, but the game is different. Similar to elite sport, small margins make a big difference at the top, but in the lower tiers, effort and hard work get you a long way.

  • When analysing the top 20 sites we found the following factors increase rankings:

    • Fresher content (Buzzsumo Published Date)

    • Higher first-paragraph readability (measured by FKRE, Flesch-Kincaid Reading Ease): increasing readability from -200 to 200 increases rank by 1 position on average

    • Wider link authority variety

    • Overall sentiment ranking

  • When analysing the bottom 80 sites we found the following factors increase rankings:

    • Site speed

    • Content heaviness
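The readability factor mentioned for the top-20 sites (FKRE) can be computed with the standard Flesch Reading Ease formula. This sketch uses a naive vowel-group syllable counter, which real readability libraries improve on:

```python
import re

def flesch_reading_ease(text):
    """Standard Flesch Reading Ease formula:
    206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / word).
    Syllables are approximated by counting vowel groups."""
    sentences = max(1, len(re.findall(r"[.!?]+", text)))
    words = re.findall(r"[A-Za-z]+", text)
    syllables = sum(max(1, len(re.findall(r"[aeiouy]+", w.lower())))
                    for w in words)
    n = max(1, len(words))
    return 206.835 - 1.015 * (n / sentences) - 84.6 * (syllables / n)

print(round(flesch_reading_ease("The cat sat on the mat."), 1))
```

Higher scores mean easier reading; short sentences built from short words score well, which is why the first paragraph of a page is worth auditing.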

Andreas Voniatis summarised: "The tactics used by top-flight brands will be very different to those of start-up sites, which have far less inbound-linked content by comparison. According to the stats, those starting afresh will need to focus heavily on building authority."

What doesn’t matter?

Word count - The analysis indicates that there’s no optimum word count for a web page. The number of words in the body copy simply isn’t a ranking factor. Naturally, sites with thin or no copy may present user experience issues and invite a high bounce rate, but as far as the credit card sector goes, the sheer number of words doesn’t have an impact.

Social media - Our study found no evidence that social media metrics such as Likes, Shares, Retweets have an impact on rank.

Social media metrics had an R-squared value of zero against rankings, meaning they explain none of the variance in rank. This doesn’t mean that social media has literally no impact on search, but our analysis couldn’t find any statistical evidence of one.
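For readers unfamiliar with R-squared: it measures the share of variance in the target that a prediction explains, and a prediction that ignores the data entirely (always guessing the mean) scores exactly zero. A small self-contained illustration:

```python
def r_squared(y, y_pred):
    """Coefficient of determination: the share of variance in y
    explained by the predictions."""
    mean_y = sum(y) / len(y)
    ss_tot = sum((v - mean_y) ** 2 for v in y)
    ss_res = sum((v - p) ** 2 for v, p in zip(y, y_pred))
    return 1 - ss_res / ss_tot

ranks = [3, 10, 25, 60, 90]  # made-up rank data for illustration
# A "prediction" that ignores the data entirely (always the mean):
mean_only = [sum(ranks) / len(ranks)] * len(ranks)
print(r_squared(ranks, mean_only))  # 0.0 -- explains none of the variance
```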

Other findings

The model shows that every page starts from a baseline ranking of 86, and the ranking moves higher or lower depending on the ranking factors.

So for example, for every one-point increase in the Ahrefs rank of the ranking domain, the ranking improves by 0.59 positions.

For every increase in domain authority, we get an improvement in rankings of 0.13 positions. One domain had the largest number of high rankings, averaging position 6 across its large coverage of 70 keywords, followed by another that had the highest average ranking of 3.6, although it represented only 14% of the dataset.

HTML content size also mattered: pages with less HTML content ranked higher by 2 clear positions on average.
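Putting the baseline of 86 and the per-factor effects together, the model’s predictions can be caricatured as a simple additive adjustment. The coefficients below (0.59 for Ahrefs rank, 0.13 for domain authority) come from the study; treating them as plain additive terms is our simplification, not the study’s actual model:

```python
# Toy additive version of the study's model. Coefficients are the ones
# quoted in the study; combining them additively is our simplification.
BASELINE_RANK = 86

def predicted_rank(ahrefs_rank_gain=0, domain_authority_gain=0):
    improvement = 0.59 * ahrefs_rank_gain + 0.13 * domain_authority_gain
    return BASELINE_RANK - improvement  # lower number = better rank

print(predicted_rank(ahrefs_rank_gain=50, domain_authority_gain=20))
```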

About Artios:

Artios is an Artificial Intelligence (AI) driven SEO agency based in London. Its core services are machine-learning-driven content strategies for SEO. Artios uses proprietary AI technology to deliver predictive, quantified marketing.

About the study:

We used a machine learning model to predict the Google ranking for a given variation of a keyword, using all the other variables associated with each keyword.

The predictive modelling process used supervised learning, with regression rather than classification, since predicting Google ranking is a regression problem. Our models are resistant to overfitting, so they should continue to predict well on future data.

Our algorithm looks at the data and selects the variable that best predicts the outcome. It then splits the data into two sections, depending on their value of that ranking factor, and each section is split further by looking at other ranking factors.
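The splitting procedure described above is essentially a decision tree. A minimal sketch of finding one variance-reducing split, on made-up data (the factor names and values here are illustrative only, not the study’s):

```python
def variance(ys):
    m = sum(ys) / len(ys)
    return sum((y - m) ** 2 for y in ys) / len(ys)

def best_split(rows, ranks):
    """Find the (factor, threshold) whose split leaves the least
    rank variance. rows: list of dicts of ranking-factor values."""
    best = None
    for factor in rows[0]:
        for threshold in {r[factor] for r in rows}:
            left = [y for r, y in zip(rows, ranks) if r[factor] <= threshold]
            right = [y for r, y in zip(rows, ranks) if r[factor] > threshold]
            if not left or not right:
                continue
            score = variance(left) * len(left) + variance(right) * len(right)
            if best is None or score < best[0]:
                best = (score, factor, threshold)
    return best

# Invented example: two factors, four pages, and their Google ranks.
rows = [{"nofollow": 1, "speed_ms": 900},
        {"nofollow": 50, "speed_ms": 300},
        {"nofollow": 400, "speed_ms": 250},
        {"nofollow": 900, "speed_ms": 200}]
ranks = [88, 40, 12, 5]
print(best_split(rows, ranks))
```

A full tree would then recurse into each half with the remaining factors, which is how a model can rank 170 factors by how much predictive work each one does.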

After more than 50 replications of the model, training and testing on different subsets of the data, the mean absolute error was 17.71. This means that, on average, the model’s Google ranking prediction is off by about 18 positions on data unseen by the model.
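Mean absolute error is simply the average of the absolute prediction errors. A quick worked example on invented numbers:

```python
def mean_absolute_error(actual, predicted):
    """Average absolute difference between actual and predicted ranks."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

# Invented actual vs predicted Google ranks for four pages.
actual_ranks = [3, 20, 55, 90]
predicted_ranks = [10, 35, 40, 95]
print(mean_absolute_error(actual_ranks, predicted_ranks))  # 10.5
```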