Simplified genomes from Atlantic Forest plants (Detoni et al., 2016)
Physics and chemistry were the engines of the 20th century. They gave rise to remarkable innovations in nuclear energy, transportation and food production. However, some problems related to energy supply, food safety, environmental quality and global health have resisted the chemical and physical engineering approach, and little has been achieved on them over the past 30-50 years. A new hope came with biotechnology.
Biotechnology started with the unveiling of the DNA structure in 1953 (Watson and Crick, 1953), which led to the first recombinant organisms in the 1980s and the sequencing of the entire human genome in 2001. It has the potential to solve every single problem on the planet.
This strong assumption has a strong basis. Life on Earth is about 4 billion years old. During this period, evolution has solved every problem that life has faced. These solutions, in the form of structural and functional proteins (enzymes), are stored in the DNA of species. Genomes are the repository of this information and the ultimate form of biodiversity. Accessing and decoding genomes could unlock this evolutionary knowledge to create solutions to problems, old or new. Sequencing and decoding genomes is therefore of extreme importance and value.
Tripp and Grueber (2011) showed in their report ‘Economic Impact of the Human Genome Project’ that a USD 3.8 B investment in the Human Genome Project drove USD 796 B in economic impact, creating 310,000 jobs and launching the genomic revolution. An investment of almost USD 4 billion over 10 years has created, after 10 years, an economic impact of almost USD 1 trillion. That is an amazing (and maybe unprecedented?) return on investment. This is bioeconomy.
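The size of that return can be checked with a few lines of arithmetic. The sketch below uses only the two figures quoted from Tripp and Grueber (2011). It is not a true IRR calculation, which would need year-by-year cash flows that are not given here. The 20-year horizon and the lump-sum treatment are assumptions made purely for illustration.

```python
# Figures quoted from Tripp and Grueber (2011), in billions of USD.
investment_busd = 3.8
impact_busd = 796.0

# Simple return multiple: economic impact per dollar invested.
multiple = impact_busd / investment_busd

# ASSUMPTION (not from the report): treat the investment as a single
# lump sum and the impact as a single payoff `horizon_years` later.
horizon_years = 20
annualized_rate = multiple ** (1 / horizon_years) - 1

print(f"Return multiple: {multiple:.0f}x")
print(f"Implied annualized rate over {horizon_years} years: {annualized_rate:.1%}")
```

Under these assumptions each dollar invested corresponds to roughly 209 dollars of economic impact. A different horizon would change the annualized rate but not the multiple.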
What if we could sequence every living being?
We sequence genomes. We digitize animals and plants. We digitize forests and oceans. We digitize life! #MyScienceInATweet
“There are between 5 million and 30 million species on Earth, each one containing many thousands of genes. However, fewer than 2 million species have been described, and knowledge of the global distribution of species is limited. History reveals that less than 1% of species have provided the basic resources for the development of all civilizations thus far, so it is reasonable to expect that the application of new technologies to the exploration of the currently unidentified and overwhelming majority of species will yield many more benefits for humanity. […] This approach, which exploits the vast databases of natural history together with ecological and evolutionary theory, has been given a variety of names, including ecologically driven drug discovery, the biorational approach, and hypothesis-driven drug discovery” (Beattie et al., 2005).
The Organisation for Economic Co-operation and Development, in its ‘Bioeconomy 2030’ report (OECD, 2009), estimates that “35% of all chemicals, 80% of all pharmaceuticals and 50% of all agricultural output will come from biotech, contributing with almost 3% of OECD countries GDP”.*
The pharmaceutical industry is one example of the high prize at stake in drug discovery. More than 50% of all prescribed drugs in the US have active principles coming from plants (Grifo et al., 1997). Captopril is one of a select list of blockbuster drugs: it was developed for USD 600 M from a principle extracted from the venom of the Brazilian snake jararaca, and it has an annual revenue of USD 5 B. The size of the market locked in tropical forests is estimated to be USD 109 billion (Mendelsohn and Balick, 1995).
What are the odds that we will find a new drug or product if we could sequence everything? We don’t need to actually calculate them (although it would be great to) to know that they are high. Best of all, nowadays we CAN sequence everything!
As DNA sequencing prices fall at a speed 10 times faster than Moore’s law (figure below, from Wetterstrand, 2014), we will very soon be able to sequence every living organism on Earth.
There are several ‘k’ (thousand-genome) projects, and they are moving fast toward a million. The Beijing Genomics Institute (BGI) has sequenced 3,000 rice genomes; the ’10k genome consortium’ is aiming to create a genetic ‘Noah’s Ark’ of biodiversity; and Regeneron Pharmaceuticals is among the companies carrying out their own 100k human genome projects. In a race similar to the one that sequenced the whole human genome in 2001, Barack Obama gave more than USD 100 M for the NIH to create a ‘1 Million human genomes’ project, while Craig Venter announced a company with a USD 70 M investment to sequence 100k human genomes a year.
At the same pace that the price of sequencing is dropping, the cost of analysing the resulting sequences is rising (figure below, from Sboner et al., 2011). To date there is no automated and scalable solution to the problem of processing all this data (assembling, annotating and interpreting genomes). There is also no convenient or cheap solution for storing the huge amounts of genomic data and making them available online to everyone.
But things are moving fast. We published a pre-print with a simplified version of the genomes of 50 species of the Atlantic Forest (Detoni et al., 2016). It took us 6 months to do it, and we showed that we can find enzymes from major terpene production pathways in the unique databank of Atlantic Forest species we have created. We can now scale up the sampling, sequencing and bioinformatics pipeline to sequence 100, 1,000 or 10,000 species.
That is why it is so important that we keep sequencing genomes.
Detoni MAA, Cardenas RGCCL, Uliano-Silva M, de Freitas Rebelo M. (2016) Gene discovery in Atlantic Forest plant species using GR-RSC simplified genomes. PeerJ Preprints 4:e2316v1 https://dx.doi.org/10.7287/peerj.preprints.2316v1 accessed 12/01/2017
Beattie AJ, Barthlott W, Elisabetsky E, Farrel R, Kheng CT, Prance I. 2005. New Products and Industries from Biodiversity. In: Hassan R, Scholes R and Ash N (eds) Ecosystems and human well-being: current state and trends, Vol. I, Island Press, Washington DC. Chapter 10
OECD (2009), The Bioeconomy to 2030: Designing a Policy Agenda, OECD Publishing, Paris. http://dx.doi.org/10.1787/9789264056886-en accessed 12/01/2017
Grifo F, Newman D, Fairfield A, Bhattacharya B, Grupenhoff J. The origins of prescription drugs. In: Grifo F, Rosenthal J, editors. Biodiversity and Human Health. Washington, DC: Island Press; 1997. pp. 131–163.
Mendelsohn, R., & Balick, M. J. (1995). The value of undiscovered pharmaceuticals in tropical forests. Economic Botany, 49(2), 223–228. http://doi.org/10.1007/BF02862929 accessed 12/01/2017
Sboner A, Mu XJ, Greenbaum D, Auerbach RK, Gerstein MB. 2011. The real cost of sequencing: higher than you think! Genome Biology 12:125 https://doi.org/10.1186/gb-2011-12-8-125 accessed 12/01/2017
Tripp, S. and Grueber, M. (2011). Economic Impact of the Human Genome Project. Technical report, Battelle Memorial Institute.
Watson JD, Crick FHC. 1953. Molecular Structure of Nucleic Acids. Nature 171:737–738.
Wetterstrand, K. (2014). DNA sequencing costs: Data from the NHGRI Genome Sequencing Program (GSP). Technical report, NIH. https://www.genome.gov/sequencingcostsdata/ accessed 12/01/2017
* These figures could be underestimated, since biofuels and unpredicted applications cannot be accounted for.