
Public Data Collection Is Advancing, But Still Far From Its Full Potential

Contributor Staff

3 Oct 2022, 4:20 pm GMT+1


The web scraping industry is maturing from both a technology and a business perspective, but it still lacks proper regulation. For this reason, key market players are launching an Ethical Web Data Collection Initiative (EWDCI) to share best practices and advocate for common principles. These were some of the main takeaways from this year’s edition of OxyCon, a prominent industry conference.

Organized by Oxylabs, a leading provider of public web data gathering solutions, OxyCon connected global web scraping experts for a two-day online event. From practical tips for engineers to high-level panel discussions, the conference speakers reviewed the most recent developments in the field.

Allen O'Neill, CEO and CTO at The DataWorks, argued that while the web scraping industry has developed rapidly over the years, much of its potential is still untapped:

“The web scraping industry hasn’t even scratched the surface with its potential yet. There will be many new unicorns in the industry in the upcoming ten years - those who will be able to harness the power of information extraction (not data extraction, but information extraction) and use that to gain insights that have never been seen before,” said O'Neill.

The industry’s fast growth was evident in the fact that scaling was the hottest topic at OxyCon. Karsten Madsen, CEO at SEO company Morningscore, shared the story of his team moving from small data requests to competing with SEO industry giants. According to him, it’s not always about having the most data or the smartest data - it’s about having smarter algorithms to manage it.

Glen De Cauwsemaecker, Lead Crawler Engineer at OTA Insight, had another tip for scaling data operations: “Be pragmatic and look for cost-reward balance,” he recommended to fast-growing data companies.

Besides the technical challenges of scaling, legal issues are often near the top of the list of concerns. The participants of the panel discussion “Lawyers discuss scraping” emphasized the ambiguity and the many unclear areas that come with the lack of proper industry regulation. As a result, the industry must be proactive in safeguarding itself from within and in sharing best practices among its members.

In this light, Christian Dawson, Executive Director at I2Coalition, announced a new web scraping industry initiative. I2Coalition, together with five public data aggregators (Oxylabs, Zyte, Smartproxy, Coresignal, and Sprious), has launched the Ethical Web Data Collection Initiative (EWDCI). The aim of the group will be to promote the industry’s best practices and advocate for beneficial technical standards.

