Data Science on Blockchain with R. Part I: Reading the blockchain

By Thomas de Marchin (Senior Manager Statistics and Data Sciences at PharmaLex) and Milana Filatenkova (Manager Statistics and Data Sciences at PharmaLex)

June 2021

CryptoPunks are the earliest versions of NFTs (www.larvalabs.com/cryptopunks). Image modified from https://commons.wikimedia.org/wiki/File:Cryptopunks.png.

Introduction

What is the Blockchain: A blockchain is a growing list of records, called blocks, that are linked together using cryptography. It is used for recording transactions, tracking assets, and building trust between participating parties. Best known for its use in Bitcoin and other cryptocurrencies, blockchain technology is now applied in many other domains, including supply chain, healthcare, logistics and identity management. Hundreds of blockchains exist, each with its own specifications and applications: Bitcoin, Ethereum, Tezos…

What are NFTs: Non-Fungible Tokens (NFTs) are used to represent ownership of unique items. They let us tokenize things like art, collectibles, even real estate. An NFT can only have one official owner at a time and is secured by the blockchain: no one can modify the record of ownership or copy/paste a new NFT into existence. You have probably heard of the artist Beeple, who sold one of his NFT artworks for $69 million.

What is R: R is a programming language widely used among statisticians and data miners for developing data analysis software.

Why do data science on the blockchain: As blockchain technology is booming, there is a growing need for tools and people able to read, understand and summarize this type of information.

What is an API: An API (Application Programming Interface) is a software intermediary that allows two applications to talk to each other. APIs are designed to help developers make requests to another piece of software (e.g. to download information) and get the results in a predefined, easy-to-read format, without having to understand how that software works.

There are already several articles dealing with blockchain data analysis in R, but most of them focus on price forecasting. Obtaining cryptocurrency price data is quite straightforward, as many databases are available on the internet. But how do we actually read the blockchain? In this article, we will focus on reading blockchain transactions: not all transactions, but specifically those related to NFTs. We will read the Ethereum blockchain, probably the most widely used one for trading NFTs. Several marketplaces serve as trading platforms for NFTs: OpenSea, Rarible… We will focus here on OpenSea, the largest NFT trading platform at the moment.

Reading the raw blockchain is possible, but it is hard. First, you have to set up a node and download the content of the blockchain (approximately 7 TB at the time of writing), which may take a while to synchronize. Second, the data are stored sequentially, so following a transaction requires developing specific tools. Third, the structure of a block is particularly difficult to read. As an example, a typical transaction on the Ethereum network is shown below. There are between 200 and 300 transactions per block and, at the time of writing, we are at block 12586122.

Figure 1: Each block in the chain contains some data and a ‘hash’ – a digital fingerprint that is generated from the data contained within the block using cryptography. Every block also includes the hash from the previous block. This becomes part of the data set used to create the newer block’s hash, which is how blocks are assembled into a chain. Image from ig.com.
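The chaining described in Figure 1 can be illustrated with a toy sketch in base R. It is only an illustration under stated assumptions: MD5 (via `tools::md5sum`) stands in for the hash function, whereas Ethereum uses Keccak-256, and `hash_text`, `make_block` and the block fields are hypothetical names rather than the real Ethereum block structure.

```r
# Toy hash chain in base R. MD5 is a stand-in for illustration only;
# Ethereum itself uses Keccak-256 and a much richer block structure.

# Hash a character string by writing it to a temporary file
# (tools::md5sum works on files).
hash_text <- function(x) {
  f <- tempfile()
  on.exit(unlink(f))
  writeBin(charToRaw(x), f)
  unname(tools::md5sum(f))
}

# A block stores its data, the previous block's hash, and its own hash,
# which is computed from both.
make_block <- function(data, prev_hash) {
  list(data = data,
       prev_hash = prev_hash,
       hash = hash_text(paste0(prev_hash, data)))
}

# Build a three-block chain (hypothetical transactions).
genesis <- make_block("genesis", prev_hash = "0")
block_1 <- make_block("Alice pays Bob 1 ETH", genesis$hash)
block_2 <- make_block("Bob pays Carol 2 ETH", block_1$hash)

# Tampering with an early block changes its hash, so the link stored
# in the next block no longer matches.
tampered <- make_block("Alice pays Bob 100 ETH", genesis$hash)
identical(tampered$hash, block_2$prev_hash)  # FALSE
```

Because each hash depends on the previous one, changing any block would require recomputing every block that follows it.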

Figure 2: Structure of a transaction.

Fortunately for us, there are APIs which facilitate our work.

OpenSea API

OpenSea provides an API for fetching non-fungible ERC721 assets based on a set of query parameters. Let’s have a look:
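A request to the events endpoint might be sketched as follows. This is a hedged sketch, not necessarily the code behind this article: the endpoint URL and the query parameters (`event_type`, `only_opensea`, `limit`) are assumptions based on the OpenSea v1 API and may have changed since, an API key may now be required, and `build_url` and `fetch_opensea_events` are hypothetical helper names.

```r
# Sketch of a request to the OpenSea events endpoint.
# ASSUMPTIONS: the URL and the query parameters (event_type, only_opensea,
# limit) reflect the v1 API as we understand it; they may have changed,
# and an API key may now be required.

# Build a query URL from a base address and a named list of parameters.
build_url <- function(base, query) {
  values <- vapply(query, function(v) {
    text <- if (is.numeric(v)) format(v, scientific = FALSE, trim = TRUE) else as.character(v)
    utils::URLencode(text, reserved = TRUE)
  }, character(1))
  paste0(base, "?", paste0(names(query), "=", values, collapse = "&"))
}

opensea_url <- build_url(
  "https://api.opensea.io/api/v1/events",
  list(event_type = "successful", only_opensea = "true", limit = 300)
)

# Download and parse the JSON answer. This needs the jsonlite package and
# network access, so it is wrapped in a function and not run here.
fetch_opensea_events <- function(url) {
  if (!requireNamespace("jsonlite", quietly = TRUE)) {
    stop("Please install the 'jsonlite' package.")
  }
  # flatten = TRUE turns nested fields into columns such as
  # payment_token.name or seller.address.
  jsonlite::fromJSON(url, flatten = TRUE)
}
```

Calling `str()` on the object returned by `fetch_opensea_events(opensea_url)` is a quick way to locate the table of events inside the parsed answer.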

There is not much explanation of the content of this dataset on the OpenSea website. I thus selected a few columns which seemed to contain interesting information (at least the ones I could understand).

My guess is that we have:

  • collection_slug: The collection to which the item belongs
  • contract_address: All the sales are managed by a contract (a piece of code / a software) which sends the NFT to the winner of the bid. This is the address of the OpenSea contract. We can see that there is only one address for all the sales, which means that all sales are managed by the same contract.
  • id: A unique identifier for each sale
  • quantity: The number of items sold per transaction (see fungible / semi fungible below). As in the supermarket, you can buy 1 apple or 20.
  • payment_token.name: The cryptocurrency used to buy the item.
  • total_price: The cost paid by the winner. For Ether, this is expressed in Wei, the smallest denomination of ether. 1 ether = 1,000,000,000,000,000,000 Wei (10^18).
  • seller.address: The address of the seller
  • transaction.timestamp: Date of the transaction
  • winner_account.address: The address of the buyer
  • payment_token.usd_price: The price of one token used to make the transaction in USD
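Given these columns, converting a sale price from Wei to ether and then to USD could look like the sketch below. The column names are the ones listed above; the rows are invented for illustration and `wei_per_ether` is a hypothetical variable name.

```r
# Toy sales table using the column names listed above.
# The values are invented for illustration, not real OpenSea data.
sales <- data.frame(
  total_price = c("1500000000000000000", "250000000000000000"),  # Wei, as text
  payment_token.usd_price = c(2500, 2500),                        # USD per token
  stringsAsFactors = FALSE
)

wei_per_ether <- 1e18

# Wei -> ether, then ether -> USD using the token price in USD.
sales$price_eth <- as.numeric(sales$total_price) / wei_per_ether
sales$price_usd <- sales$price_eth * sales$payment_token.usd_price

sales[, c("price_eth", "price_usd")]
```

Note that `as.numeric()` stores these large integers as doubles, so very large Wei amounts are only approximate. That is acceptable for summaries and plots, but not for exact accounting.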

Let’s have a look at the distribution of currencies:


We see that most sales are made in Ether (note that Wrapped Ether can be considered the same as Ether). Let’s focus on these Ether sales for the rest of the article.
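Tallying the currencies and keeping only the Ether sales might be sketched as below. Only the column name `payment_token.name` and the values "Ether" and "Wrapped Ether" come from the text; the other rows and currency names are invented for illustration.

```r
# Invented example rows; only the column name payment_token.name and the
# values "Ether" and "Wrapped Ether" come from the text above.
sales <- data.frame(
  id = 1:6,
  payment_token.name = c("Ether", "Ether", "Wrapped Ether",
                         "Dai Stablecoin", "Ether", "USD Coin"),
  stringsAsFactors = FALSE
)

# Number of sales per currency, most frequent first.
currency_counts <- sort(table(sales$payment_token.name), decreasing = TRUE)
print(currency_counts)

# Keep the Ether sales, treating Wrapped Ether as Ether.
ether_sales <- sales[sales$payment_token.name %in% c("Ether", "Wrapped Ether"), ]
nrow(ether_sales)  # 4
```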


Figure 3: Histogram of the price per sale. Data from OpenSea API.

Figure 4: Pie chart of the price per sale. Data from OpenSea API.

Figure 5: Waffle chart of the price per sale. Data from OpenSea API.
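Charts like these need the sale prices grouped into brackets first. The sketch below shows one possible way to do that in base R. The prices and the bracket limits are invented for illustration and are not the ones used for the figures.

```r
# Invented sale prices in USD, for illustration only.
price_usd <- c(3, 12, 45, 80, 150, 420, 990, 2500, 18000)

# Hypothetical price brackets; cut() assigns each sale to one bracket.
# right = FALSE makes each bracket include its lower limit.
breaks <- c(0, 10, 100, 1000, 10000, Inf)
labels <- c("<10", "10-100", "100-1000", "1000-10000", ">=10000")
bracket <- cut(price_usd, breaks = breaks, labels = labels, right = FALSE)

# Share of sales per bracket: the input for a pie or waffle chart.
share <- round(100 * prop.table(table(bracket)), 1)
print(share)

# These toy prices span several orders of magnitude, so a log10 scale
# keeps the histogram readable.
hist(log10(price_usd), main = "Price per sale", xlab = "log10(price in USD)")
```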

This looks pretty nice, but there is a big drawback… The OpenSea API limits the number of events to the last 300 transactions. There is not much we can do about it if we use their API. Also, the downloaded transactions are data pre-processed by OpenSea and not the blockchain itself. This is already a good start for getting information about transactions, but what if we wanted to read the blocks? We have previously seen that retrieving data directly from the blockchain can be quite complex. Fortunately, there are services like Etherscan that allow you to explore Ethereum blocks in an easy way. And guess what? They also developed an API!

EtherScan API

EtherScan is a block explorer, which allows users to view information about transactions that have been submitted to the Ethereum blockchain, verify contract code, visualize network data… We can therefore use it to read any transaction involving NFTs! EtherScan limits the number of transactions to 10,000 in its free version, which is much better than the OpenSea API, and you can still subscribe if you need more.

Where do we start? Let’s focus again on OpenSea: from the data we extracted above, we saw that the address of their contract is “0x7be8076f4ea4a4ad08075c2508e481d6c946d12b”. If we enter this address in EtherScan and filter on the completed transactions (i.e. those validated by the network, not those waiting to be approved), https://etherscan.io/txs?a=0x7be8076f4ea4a4ad08075c2508e481d6c946d12b, we see an incredible number of them (848,965 at the time of writing). Of course, not all of them are related to sales.
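A request for this contract's transactions could be sketched as follows. This is a hedged sketch rather than the exact code behind this article: the endpoint and parameter names are assumptions based on the EtherScan `account` / `txlist` API and should be checked against the current documentation, `YOUR_API_KEY` is a placeholder, and `build_url` and `tidy_transactions` are hypothetical helper names.

```r
# Sketch of an Etherscan "list of transactions by address" request.
# ASSUMPTIONS: endpoint and parameter names follow Etherscan's account/txlist
# API as we understand it; check the current documentation before use.
# YOUR_API_KEY is a placeholder.

# Build a query URL from a base address and a named list of parameters.
build_url <- function(base, query) {
  values <- vapply(query, function(v) {
    text <- if (is.numeric(v)) format(v, scientific = FALSE, trim = TRUE) else as.character(v)
    utils::URLencode(text, reserved = TRUE)
  }, character(1))
  paste0(base, "?", paste0(names(query), "=", values, collapse = "&"))
}

opensea_contract <- "0x7be8076f4ea4a4ad08075c2508e481d6c946d12b"

etherscan_url <- build_url(
  "https://api.etherscan.io/api",
  list(module = "account", action = "txlist", address = opensea_contract,
       startblock = 0, endblock = 99999999, sort = "desc",
       apikey = "YOUR_API_KEY")
)

# Turn the "result" table of the JSON answer into typed columns.
# Etherscan returns every field as text: timeStamp is in Unix seconds
# and value is in Wei.
tidy_transactions <- function(result) {
  data.frame(
    hash = result$hash,
    timestamp = as.POSIXct(as.numeric(result$timeStamp),
                           origin = "1970-01-01", tz = "UTC"),
    value_eth = as.numeric(result$value) / 1e18,
    stringsAsFactors = FALSE
  )
}

# With jsonlite installed and a valid key, the download could look like:
# transactions <- tidy_transactions(jsonlite::fromJSON(etherscan_url)$result)
```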

Let’s see what we can do with these data.

Figure 6: Histogram of the price per sale. Data from EtherScan API.

Figure 7: Pie chart of the price per sale. Data from EtherScan API.

Figure 8: Waffle chart of the price per sale. Data from EtherScan API.

Note that the price conversion from ETH to USD is not entirely correct. We use the current ETH/USD price, while some transactions took place some time ago. Even within a single day, there can be substantial price variation! This can be easily solved by retrieving historical ETH/USD prices, but this requires an EtherScan Pro account.
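If a table of historical daily prices were available, each transaction could be matched to the price of its own day, as in the sketch below. All values are invented for illustration, the table and column names (`daily_price`, `eth_usd`) are hypothetical, and where the daily prices would come from is left open.

```r
# Invented transactions and daily ETH/USD prices, for illustration only.
transactions <- data.frame(
  timestamp = as.POSIXct(c("2021-05-30 14:05:00", "2021-06-01 09:30:00"), tz = "UTC"),
  value_eth = c(0.5, 2),
  stringsAsFactors = FALSE
)
daily_price <- data.frame(
  date = as.Date(c("2021-05-30", "2021-05-31", "2021-06-01")),
  eth_usd = c(2400, 2700, 2600)  # made-up values
)

# Match each transaction to the price of its own day rather than today's.
transactions$date <- as.Date(transactions$timestamp, tz = "UTC")
priced <- merge(transactions, daily_price, by = "date", all.x = TRUE)
priced$value_usd <- priced$value_eth * priced$eth_usd

priced[, c("date", "value_eth", "eth_usd", "value_usd")]
```

A daily price still ignores variation within the day, so this only reduces the conversion error rather than removing it.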

Conclusion

This article is an introduction to reading the blockchain and obtaining transaction data that can be analyzed and visualized. Here, we have shown an example of a simple analysis of blockchain transactions – distribution charts of NFT sales prices. This is a good start, but there is a lot more we could do! In Part II, we will investigate how to go further. We can, for example, follow specific NFTs and explore how long they stay in one owner’s possession after release and how often they are subsequently reacquired. Please let me know which attributes you may find interesting.

Note that the code used to generate this article is available on my Github: https://github.com/tdemarchin/DataScienceOnBlockchainWithR-PartI

If you want to help me to subscribe to an EtherScan Pro account and be able to retrieve more transactions, don’t hesitate to donate to my ETH address: 0xf5fC137E7428519969a52c710d64406038319169

References

https://docs.opensea.io/reference

https://www.dataquest.io/blog/r-api-tutorial/

https://ethereum.org/en/nft

https://influencermarketinghub.com/nft-marketplaces

https://www.r-bloggers.com/

https://etherscan.io/

https://en.wikipedia.org/wiki/Blockchain
