This workflow explores a different way to investigate therapeutic targets for drug discovery and patent landscaping. A conventional sequence search starts with an existing biological sequence, such as an antisense mRNA, peptide, or antibody, and asks where that sequence has been tested and how similar it is to other sequences. Here we take a step back: think of it as a prequel rather than a sequel. The focus is on the early stage, before you have a sequence in hand. Have you ever wondered how to explore the intellectual property (IP) landscape around a therapeutic target? Or are you looking for a way to understand what is already known, so that prior work can inform your discovery efforts?
This is where Patsnap stands out. It combines three interconnected yet distinct datasets through a process we call "triangulation." We deeply integrate sequence information from patents with data from our secret sauce, CAS. This integration provides insights that are not easily found elsewhere.
What is CAS and why is it valuable?
CAS, short for the Chemical Abstracts Service, is an important data partner of Patsnap. Its value comes from one distinctive attribute. Rather than relying solely on extracting sequence data from sequence listings, or on Optical Character Recognition (OCR) to pull sequences out of images, CAS has a notable advantage in the life sciences: manual curation.
At CAS, a dedicated team of chemists and molecular biologists reviews a wide range of scientific literature from around the world. They extract only the substances that matter in terms of novelty, patentability, and inventive step. This curation gives CAS a focused and refined supplementary dataset. The human element in their curation is an ideal complement to our own expertise in AI data processing, and it enhances the value and reliability of the data offered by Patsnap.
Old Vs. New Method
After reading through this workflow, you will see how much more efficient Patsnap is than the traditional way of reaching a similar outcome. The old method is highly manual: you need to go through patent PDFs to find the sequences in the sequence listings, which becomes even harder if the document is in another language, such as Mandarin. The whole process is tedious. Furthermore, biotech companies or research teams do not always have a single clear target in mind; they might only have a shortlist of 10 to 20 options. Repeating this exercise for each candidate without any assistance quickly becomes a headache.
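To give a sense of what the manual route involves even after the text has been pulled out of the PDF, here is a minimal Python sketch that parses a sequence listing. It assumes the listing follows the WIPO ST.25 numeric identifiers (<210> sequence number, <211> length, <212> type, <213> organism, <400> sequence). The sample text, the function name parse_listing, and the record layout are all hypothetical and for illustration only; this is not how Patsnap processes documents.

```python
import re

# Hypothetical ST.25-style excerpt with short illustrative sequences
# (not taken from any patent).
SAMPLE_LISTING = """\
<210> 1
<211> 12
<212> PRT
<213> Homo sapiens
<400> 1
Met Ala Asn Cys Glu Phe Ser Pro Val Ser Gly Asp
1               5                   10

<210> 2
<211> 15
<212> DNA
<213> Artificial Sequence
<400> 2
atggccaact gcgag                                                   15
"""


def parse_listing(text):
    """Split an ST.25-style listing into one dict per SEQ ID NO."""
    records = []
    # Each record runs from a <210> tag to the next <210> tag (or end of text).
    for chunk in re.findall(r"<210>.*?(?=<210>|\Z)", text, flags=re.S):
        # Collect the header fields <210> to <213> as {tag: value}.
        fields = dict(re.findall(r"^<(21[0-3])>[ \t]*(.*)$", chunk, flags=re.M))
        # Everything after <400> is the sequence body.
        body = chunk.split("<400>", 1)[1] if "<400>" in chunk else ""
        # Keep alphabetic tokens only; this drops position numbers and spacing.
        tokens = re.findall(r"[A-Za-z]+", body)
        seq_type = fields.get("212", "").strip()
        records.append({
            "seq_id": int(fields["210"]),
            "declared_length": int(fields["211"]),
            "type": seq_type,
            "organism": fields.get("213", "").strip(),
            # Proteins are listed as three-letter codes, nucleotides as letters.
            "residues": tokens if seq_type == "PRT" else list("".join(tokens)),
        })
    return records


records = parse_listing(SAMPLE_LISTING)
for rec in records:
    length_ok = len(rec["residues"]) == rec["declared_length"]
    print(rec["seq_id"], rec["type"], rec["organism"], "length matches:", length_ok)
```

Real listings are far messier than this sample (scanned images, OCR errors, non-English headers), which is why doing this by hand for 10 to 20 candidate targets is so time-consuming.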
Example of sequence listings in a CN Patent document:
Conducting the Search
Method 1 - Drug/Gene Index
Let's look at a practical example. Instead of starting with a novel sequence, you can use a valuable but often overlooked Patsnap feature called the Drug/Gene Index. This feature lets you explore all the sequences associated with a specific biological drug or with a gene of interest. In today's example, the focus is on the gene CD38, which is closely tied to cellular aging. Academics in the anti-aging field, such as David Sinclair, emphasize the importance of CD38 in relation to inflammation, cellular decay, and disease pathways that contribute to cellular inflammation. A target like this is a good fit for the Drug/Gene Index, because the search starts from the gene rather than from a known sequence.
Navigate to the main page of Bio, where you will see the "Drug/Gene Index" tab under "Scenario" on the left-hand side of the screen. Click on the "Gene" tab and enter CD38 into the search bar:
Click on the search button and the results page will open. Once on the results page, you will see that there are 317 sequences in total:
Let us consider the scenario where we are searching for antibodies. We can use keywords like "immunoglobulin" in the search bar below:
But within the Patsnap Bio platform, we have a dedicated module specifically for searching antibodies which you can utilize instead:
All antibodies and sequences related to antibodies within our database are meticulously indexed. Consequently, we are presented with a substantial dataset comprising numerous antibodies and antibody-associated sequences. By conveniently clicking on the hyperlinks, we can promptly delve into each individual sequence and access a wealth of available information associated with it.
After clicking on the hyperlink, you will be directed to our Sequence listing detail page:
The main overview page will present you with crucial data, including the sequence length, sequence type, the organism associated with the sequence, sequence domains, sources of the sequence, and other pertinent information.
Furthermore, on the left-hand side of the page, you will find three distinct tabs: "patent," "literature," and "other sources." These tabs serve as valuable resources for referencing relevant materials. They encompass patents, scientific papers, and other sources that discuss the sequence under consideration. This information proves particularly useful for conducting Freedom to Operate (FTO) searches, as it provides insights into the novelty of a sequence based on the number of references it has received. Moreover, delving into these references can assist in identifying related sequences that may hold significance in your drug discovery endeavors.
Method 2 - Analytics
You can continue the search in Patsnap Analytics, a powerful platform for searching patents. To open Analytics, click on the waffle (app menu) icon in the top right-hand corner.
Once you are in Analytics, go to Field search and select the "title/abstract/claims" field. Here you can search for keywords that appear in the title, abstract, or claims of a patent.
You can search for multiple keywords by putting an AND operator between the keywords, ensuring all results contain your keywords at least once.
To enhance precision and relevance in your search, you can also leverage our $SEN operator. By using this operator between the keywords, the system will identify patents where the keywords appear in the same sentence, enabling you to narrow down your results to those that are more closely related to your query.
In this instance, we will continue with the theme of this article and search for antibody $SEN CD38:
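The difference between the two operators can be sketched in a few lines of Python. This is only an approximation of the behavior described above, not Patsnap's search engine: the sentence splitter is deliberately naive, and the function names and the three sample abstracts are hypothetical.

```python
import re


def split_sentences(text):
    # Naive splitter: break after '.', '!' or '?' when followed by whitespace.
    return [s for s in re.split(r"(?<=[.!?])\s+", text) if s]


def contains(text, keyword):
    # Case-insensitive whole-word match.
    return re.search(rf"\b{re.escape(keyword)}\b", text, flags=re.I) is not None


def matches_and(text, a, b):
    # AND: both keywords appear somewhere in the text.
    return contains(text, a) and contains(text, b)


def matches_same_sentence(text, a, b):
    # $SEN-style: both keywords appear within a single sentence.
    return any(contains(s, a) and contains(s, b) for s in split_sentences(text))


# Hypothetical abstracts for illustration.
abstracts = {
    "doc_a": "An isolated antibody that binds human CD38 is provided.",
    "doc_b": "The antibody binds PD-1. Expression of CD38 was measured as a control.",
    "doc_c": "A small molecule inhibitor of CD38 is disclosed.",
}

and_hits = [d for d, t in abstracts.items() if matches_and(t, "antibody", "CD38")]
sen_hits = [d for d, t in abstracts.items() if matches_same_sentence(t, "antibody", "CD38")]

print("AND :", and_hits)
print("$SEN:", sen_hits)
```

In this sketch, doc_b satisfies the AND condition but not the same-sentence condition, because "antibody" and "CD38" sit in different sentences. That is the kind of loosely related result the $SEN operator is meant to filter out.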
Once you click search, you will be presented with a list of patents that meet your query requirements. You can sort these patents by priority date, in the top left-hand corner, to see the most recent and potentially novel results.
Upon identifying a patent that captures your interest, you can navigate into the patent and access various patent-related details. These include the abstract, claims, and description, along with additional information such as the application date, publication date, estimated expiry date, and more.
Additionally, you can access a tool called the "Sequence Assistant". This tool scans the patent document and highlights any sequences within the patent. If you hover over a highlighted SEQ ID NO., you will see the sequence.
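As a rough illustration of the idea behind this kind of highlighting, the Python sketch below finds "SEQ ID NO" mentions in a passage of patent text with a regular expression. It is an assumption-laden toy, not the Sequence Assistant's actual implementation: the pattern, the function names, and the sample claim are hypothetical, and the pattern only picks up the first number in a range such as "SEQ ID NOs: 1-5".

```python
import re

# Matches variants such as "SEQ ID NO: 3", "SEQ ID NO. 4" and "seq id no 5".
SEQ_ID_PATTERN = re.compile(r"SEQ\s+ID\s+NOs?\.?\s*:?\s*(\d+)", flags=re.IGNORECASE)


def find_seq_id_mentions(text):
    """Return (start, end, seq_id) for each mention found in the text."""
    return [(m.start(), m.end(), int(m.group(1))) for m in SEQ_ID_PATTERN.finditer(text)]


def highlight(text):
    """Wrap each mention in double brackets to mimic a highlight."""
    return SEQ_ID_PATTERN.sub(lambda m: f"[[{m.group(0)}]]", text)


# Hypothetical claim text for illustration.
claim = (
    "An antibody comprising a heavy chain of SEQ ID NO: 3 "
    "and a light chain of SEQ ID NO. 4."
)

mentions = find_seq_id_mentions(claim)
print([seq_id for _, _, seq_id in mentions])
print(highlight(claim))
```

Looking up the sequence behind each identifier would then be a matter of matching the number against the parsed sequence listing for the same document.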
Furthermore, you can extract the sequences into Bio using our Seq Extraction tool in the Sequence Assistant.
Upon extraction, you will be directed to the Bio sequence results page. Here, as mentioned earlier, you can click on different sequences to access comprehensive sequence information. Additionally, you have the opportunity to explore the available references and sources that mention each specific sequence.
Method 3 - Connection with Synapse
The third method for identifying sequences involves utilizing our Synapse platform, which offers both free and premium options for users.
To access the free option, you can register by following this link: Synapse (AI-Powered Life Sciences Connected Intelligence platform to drive strategic R&D decision making)
As in Method 1, conduct a Drug/Gene Index search in the Bio platform, continuing with the CD38 example. When you click into a sequence from the search results page, you will see that the gene CD38 is hyperlinked on the sequence listing detail page.
By clicking on this hyperlink, you will be redirected to a dedicated overview page specifically focusing on the CD38 gene. At the top-right corner of the page, you will notice a button labeled "View in Synapse."
Clicking this button takes you to the dedicated CD38 page within the Synapse platform. This page brings together a wide range of information on CD38, including associated drugs, indications, clinical trials, organizations, patents, literature, and news. You will also find graphs, such as organization heatmaps and indication heatmaps, that give a visual picture of the therapeutic landscape around the target. This knowledge can help you identify diseases to pursue in clinical trials.