“Web scraping technology is not only for business, but also for solving social problems,” says Juras Juršėnas
For over seven years, Oxylabs has been the leading provider of premium proxies and public web data collection solutions that help businesses of all sizes unlock the potential of big data. Juras Juršėnas, Chief Operating Officer at Oxylabs, has established himself as an expert in IT and product management, with over 16 years of industry experience. His strategic problem-solving, critical thinking, and people management skills led him to the position of COO. Juras's day-to-day work revolves around innovation management, which often involves doing something that has never been done before. He is enthusiastic about technology and the possibilities it brings. In an exclusive interview with Analytics Insight, Juras shared his perspective on the company, its achievements, its challenges, and the future of the ethical web scraping industry.
1. Please tell us about the company, its specialization and the services your company offers.
Oxylabs is the leading provider of tools and solutions for large-scale public web data collection. Providing an infrastructure for ethical web scraping is an essential part of our daily operations.
I joined the company almost four years ago and to this day it is fascinating to be part of the Oxylabs team. In my opinion, our conscious commitment to innovation and ethics sets us apart from the rest of the competition. All of our business partners are rewarding to work with, whether they are Fortune Global 500 companies or startups who want to be the next unicorns.
Please tell us about the products/services/solutions you offer your customers and how they add value.
We provide tools and solutions for companies looking to collect publicly available data at scale. Our product catalog includes proxies and ready-to-use web data collection solutions such as Scraper APIs.
Our typical proxy infrastructure customers are large companies with the internal resources to run their own web scraping operations. They only need our extensive proxy network infrastructure to distribute their data requests or to retrieve data from specific geographic locations. Other companies opt for ready-to-use tools like Scraper APIs, which are perfect for companies that prefer to focus on analyzing data rather than collecting it. The solution consists of three different products – Ecommerce Scraper API, SERP Scraper API and Web Scraper API – each designed to collect public web data from various sources on the Internet. The simplified process is especially beneficial for smaller companies, which can get results faster and stay competitive with large companies in gaining business insights.
Web scraping is used extensively in e-commerce. E-commerce companies collect data to conduct market research and competitive analysis, to understand consumer sentiment, and to predict which goods will be trending.
Financial companies also use web scraping to analyze and evaluate companies and find new customers. These companies rely on the technology for risk management and due diligence.
Meanwhile, for certain companies, web scraping is the basis of their business operations. For example, travel price aggregators and price comparison websites rely on this technology.
In summary, we have the most comprehensive proxy network infrastructure and the most diverse range of IP addresses from different countries and cities. Our ready-to-use solutions effortlessly deliver web data to our customers.
2. What is your biggest USP that differentiates the company from the competition?
As already mentioned, we put a lot of effort into research and development. We assembled an AI/ML Advisory Board of five industry and academic leaders, including Stripe and former MIT/NASA representatives. The board supports Oxylabs in product development processes and pushes the boundaries of ethical web scraping technology.
We are very proud that our team is constantly developing new solutions. As a result of our efforts, we hold dozens of patents for our solutions and infrastructure. Web scraping is not a simple technology, and unexpected things often happen. Web scrapers break and parsing pipelines run into problems due to ever-changing website layouts. That's why we have consciously focused on innovation from day one, so that we can remove all of these obstacles in the background.
Another unique selling proposition is our ethical approach to everything we do. Take, for example, the procurement of residential proxies. These proxies redirect internet traffic through physical devices owned by real people. Under a fair practice model, users must give recorded, explicit consent, and network participants must be compensated where possible. At least, that is our stance. Unfortunately, many companies employ methods that leave people unaware that their device is acting as a proxy (exit node) for a third party.
To ensure that a fair practice model is implemented, we have created a Tier A+ model that ticks all the fair practice boxes: ensuring explicit consent, and fully informing and rewarding users for participating in the proxy network.
Mention some of the awards, achievements, recognition, and customer feedback that you consider notable and valuable to the company.
We see an increasing need to show how important web scraping technology is not only for business, but also for solving social problems. So as part of that effort, we've started a new pro bono program called Project 4β. Through 4β, Oxylabs provides academic and non-profit organizations with technical expertise, public web data collection infrastructure, and other resources free of charge.
For example, after winning the Govtech Lab Challenge, Oxylabs partnered with RRT – a Lithuanian authority that oversees Lithuania's electronic communications, postal and railway sectors. It also works to keep illegal and harmful content off the Internet. The challenge was to automate the identification of illegal content, especially content related to child sexual abuse or pornography, in the Lithuanian IP address space. Oxylabs built a specialized tool that RRT incorporated into its regular operations in early 2022.
On the business side, Oxylabs has been named Proxyway's Best Proxy Provider for a few years in a row. We were also recognized by the Financial Times as Europe's fastest growing public web data collection provider in 2022. These recognitions would not be possible without the dedication of our 400 amazing employees, who keep reaching for new heights. Kudos to them.
3. Please mention some of the biggest challenges the company has faced so far.
Developing web scraping solutions usually requires a lot of work. Being a pioneer is an incredible feeling, but it makes the tasks much more challenging. Being a frontrunner means setting the pace and putting in the work. This means that many other companies look up to us and emulate the models we develop.
We are happy to help customers maintain an uninterrupted flow of data. Supporting infrastructure, writing scraping code, and everything else require resources. Because the sector is still so new, we are constantly confronted with new problems.
Web scraping has yet to gain public awareness. To date, only a few laws worldwide address it.
So, to continue protecting our industry from within, we formed an Ethical Web Data Collection Association with four other organizations. The EWDC represents the interests of companies that rely on web scraping technology.
Our goal is to bring these companies together and advocate for best practices, help develop industry standards, and educate the public on how important web scraping technology is to businesses and consumers alike.
4. Where do you see growth for the industry?
While there are established industries like e-commerce and finance, there are also new ways to leverage collected data. Most of the data has historically come from internal sources, with a few external providers rounding out the picture. Recently, however, there has been a shift towards web scraping as the primary method of automating alternative data collection.
The alternative data industry is worth nearly $3 billion. However, the industry is still in its infancy. In our new study, we found a trend of ever-growing demand for public data to gain insights and stay relevant.