Jill – An Intelligent Research Assistant

Developed an interactive concept search engine using IBM Watson’s Bluemix APIs and Python that could comb through scientific papers and journals. The search engine tracks the user’s thought process during a literature review, making it easy for users to retrace their steps. The application harnesses Watson to give context to users’ searches and helps them find more appropriate papers for developing their research.

Demo Prototype

Problem Statement

We want to create an application that provides a one-stop shop for research paper writing. The major problems that we are trying to solve are:

  • Google Scholar
    Services like Google Scholar are not specific enough. They return a series of papers based on your query, but those papers are not necessarily in your research area. Our Watson corpus allows you to constrain the search to the particular area that you want to research.
  • Workspace clutter
    When researching, you will often refer to several papers and their references. With multiple documents open, it is hard to keep track of references spread across multiple systems and formats.
  • Technical jargon makes querying papers difficult
    Technical jargon makes entering a domain extremely difficult for novices. A simple, contextual way to enter a domain would lower that barrier.
  • Difficulty estimating the relevance of results from another domain
    Because Biologically Inspired Design works by analogy, papers from different domains need to be mapped together in terms of functionality so that we can perform similarity-based design. This is particularly difficult when you do not know how relevant a biological inspiration is to a different domain, such as architecture or engineering.

As researchers, we realized that doing a literature review for research papers is very time-consuming. Finding papers that relate to a particular area, e.g. bio-inspired design, is very difficult.

The current method of doing a literature review or research is to query Google Search or Google Scholar. While this gives you a large number of papers, it has no context for your query and often does not help you get deeper into your task. Our approach to research and cross-referencing is more holistic, and we feel it would give students a deeper understanding of the subject matter at hand.

Our application is a web-based system that students or researchers writing a paper on Biologically Inspired Design can use for a more holistic literature survey, with Watson as the knowledge source. Students can use our application to search for topics that they are interested in by highlighting lines from different papers. Watson takes the highlighted “queries” and returns relevant papers and abstracts. With Watson’s natural language processing, we expect it to surpass Google Scholar at finding relevant information within this domain.

The application allows a student to create a paper completely from scratch using Watson as the entire knowledge base. Most of the biology papers have already been loaded into Watson. We hope that adding more papers to Watson over time will make our application a complete resource for Biologically Inspired Design.

Our application makes it easy to add references within the paper. We plan to keep the overall tone of the paper as context in order to yield relevant results. The literature review emerges from a series of questions that helps you maintain the flow of your research. We intend to use it to surface relevant research papers as our research evolves.

Intended Users

The intended user of our application would be any person who needs to do research within a domain. This includes, but is not limited to:

  1. Students
  2. PhD Students
  3. Professors
  4. Postdoctoral Researchers
  5. Professional Researchers

Use Case

The primary use case is students researching Biologically Inspired Design. Our sample use case from the previous paper makes this clearer.

  1. A student navigates to our website and starts a new Research Document on “Rosette Growth in Plant Leaves”.
  2. When the student highlights the phrase “Rosette growth in Plants”, Watson returns relevant papers on the subject that the student could refer to.
  3. The papers are marked as “read” or “unread” so that students can easily identify what material they have already referred to.
  4. When the student clicks on a paper, they can make multiple highlights within the paper of text or reference material they are interested in. When they leave the paper, this material is added to the student’s research paper.
  5. The student can also choose to cite the paper they are reading in the references section automatically.
  6. A list of the student’s favorites appears on the right-hand side for quick review. This helps the student keep the most relevant papers on top and refer to them often.
  7. Once the reference material has been pulled into the research paper, the student can use it to query Watson again for more detailed information, or simply incorporate the highlighted information into their submission.
  8. In this way the student can form a corpus of documents to use as the basis for their study, and delve deeper into the subject by researching specific ideas with the power of Watson.

The application can be used to feed Watson knowledge in any number of fields. Students in every field write research papers, and having our application as a tool would make that process easier. In the future, the application can be extended to other types of research papers simply by adding those papers to the corpus, and we can keep expanding from there. We would not have to change the original program for this expansion to happen.

Value Proposition and Competitors

One of our biggest competitors is Mendeley, which provides users with 2GB of online space, with extra space available for purchase. This limit makes keeping favorite papers online difficult. Mendeley also has a desktop application that lets you download the papers you want to your system. This requires users to already know which research papers they want. Other major research paper assistants include Papers, Qiqqa, and Zotero. All of these tools are either paid or lack strong domain-specific research access. Our application adds the following value:

  • Maintains all your research in a single coherent location. It makes for better human interaction.
  • Overcomes the problem of heavy research jargon by using Watson, which relates answers to more general terms in the corpus.
  • Provides a narrower, more specific search, with Watson managing the context and direction of your research, so that results are relevant.
  • Updates the references section automatically, helping you tailor your research as you want.

Design & Architecture

The design of the application is based on the REST architecture. We created an API using the Django framework in Python. We built the frontend using the Foundation user interface library, with AngularJS making the API calls.
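The source does not show the Django views or URL names, so the following is only a framework-free sketch of what a resource-oriented API for this application could look like. The paths `/api/search` and `/api/favorites`, the `handle` function, and the `search_papers` placeholder are all hypothetical names introduced for illustration.

```python
import json

# In-memory stand-in for the real data layer (hypothetical).
FAVORITES = []


def search_papers(query):
    # Placeholder for the Watson call; this sketch returns no results.
    return []


def handle(method, path, body=None):
    """Return (status_code, json_text) for a request, REST style."""
    payload = json.loads(body) if body else {}

    if (method, path) == ("POST", "/api/search"):
        query = payload.get("query", "")
        return 200, json.dumps({"query": query, "papers": search_papers(query)})

    if (method, path) == ("GET", "/api/favorites"):
        return 200, json.dumps({"favorites": FAVORITES})

    if (method, path) == ("POST", "/api/favorites"):
        if "paper_id" not in payload:
            return 400, json.dumps({"error": "paper_id is required"})
        FAVORITES.append(payload["paper_id"])
        return 201, json.dumps({"favorites": FAVORITES})

    return 404, json.dumps({"error": "not found"})
```

In this layout the frontend only exchanges JSON with the API, which is what would let an AngularJS client stay independent of the server-side code.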

Jill – Architecture Diagram

Functionality

When you highlight text that you want to ask as a query or would like more information about, the application uses the highlighted words as a query and returns a list of papers related to that topic. At this point, you can click on the paper that most closely fits your need. You will get a short abstract of the paper to help you decide whether it is relevant to you. You are then given two options:

  1. Add paper to favorites
  2. View full paper

Adding Reference Papers to Favorites

Using the ‘Add paper to favorites’ option adds the paper to the list of papers you have referred to. This list is visible on the side of the screen and gives users easy access to the papers that they are writing about. Users do not have to keep all the papers open while they are writing; they can open any paper whenever they want, since it is easily accessible. There is also a way to add those papers as references in your research paper. This removes the work of manually creating citations for the papers being referred to.
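The favorites list and the automatic references section can be sketched as follows. This is a minimal illustration, not the project's actual data model: the `Paper` fields, the `ResearchDocument` class, and the numbered citation format are assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Paper:
    # Hypothetical fields; the source does not describe the stored metadata.
    paper_id: str
    title: str
    authors: tuple
    year: int


@dataclass
class ResearchDocument:
    favorites: list = field(default_factory=list)
    references: list = field(default_factory=list)

    def add_favorite(self, paper):
        # Keep the sidebar list free of duplicates.
        if paper not in self.favorites:
            self.favorites.append(paper)

    def cite(self, paper):
        # Add the paper to the references once and return its 1-based number.
        if paper not in self.references:
            self.references.append(paper)
        return self.references.index(paper) + 1

    def render_references(self):
        # Simple numbered style, chosen only for illustration.
        return [
            f"[{n}] {', '.join(p.authors)} ({p.year}). {p.title}."
            for n, p in enumerate(self.references, start=1)
        ]
```

Citing the same paper twice returns the same reference number, so the references section stays consistent however often a paper is used.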

Maintaining a history

Our application maintains a history of the questions that you have asked, available through a history button on the interface. This button lets you go back to the point where you think your research was most cogent. It gives you a way to maintain the context of your research and to recover a relevant paper that might otherwise get lost along the way.
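One way to model this history is an append-only list of queries with a pointer to the currently restored entry. The sketch below is an assumption about how such a feature could work, not the project's code; `QueryHistory`, `record`, and `restore` are hypothetical names.

```python
class QueryHistory:
    """Append-only log of queries and the result IDs each one returned."""

    def __init__(self):
        self._entries = []    # list of (query, result_ids) tuples
        self._current = None  # index of the entry the user is viewing

    def record(self, query, result_ids):
        # Store a new query and make it the current point in the research.
        self._entries.append((query, tuple(result_ids)))
        self._current = len(self._entries) - 1
        return self._current

    def restore(self, index):
        # Jump back to an earlier point without discarding later entries.
        if not 0 <= index < len(self._entries):
            raise IndexError("no history entry at that position")
        self._current = index
        return self._entries[index]

    def entries(self):
        # Copy, so callers cannot modify the log by accident.
        return list(self._entries)
```

Keeping later entries after a restore means nothing found during the session is lost, which matches the goal of recovering papers that would otherwise be forgotten.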

Watson

We decided to target a broader functionality of Watson. Rather than uploading specific question-answer pairs to the corpus, we decided that our application should be able to access all the questions submitted to Watson. Since our application is focused on writing research papers, it is beneficial to use all the questions and papers that the class uploaded. We benefitted a lot from all the question-answer pairs in Watson.

Our application used the IBM Watson Python API to query Watson with text highlighted in the user’s paper. We pull the name of the original paper that Watson sends back and use the answer from Watson as a trimmed sneak peek into the paper. Currently Watson’s corpus has only articles related to Biologically Inspired Design, so queries must be related to that topic. Our application can be extended so that it can be used by any researcher, from students to postdoctoral researchers, and include any form of research paper in any journal.

We performed pre-processing on the queries that our application could ask Watson. On the assumption that a person can ask anything related to design and biology, many papers on those subjects were put into Watson. We also maintained a ‘history’ and ‘context’ so that the application could return to a previous point in the research and get more specific answers. To do this, we used a simple extra database that stored the IDs of the answers and papers returned by Watson.
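The extra database of returned IDs could look something like the sketch below. It uses SQLite from the Python standard library purely for illustration; the source does not say which database was used, and the table name, column names, and the `answer_id`/`paper_id` dictionary keys are hypothetical. The Watson call itself is not shown, and `results` stands in for whatever it returned.

```python
import sqlite3


def open_store(path=":memory:"):
    # Create the table on first use; an in-memory database is the default.
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS watson_results (
               id INTEGER PRIMARY KEY AUTOINCREMENT,
               query TEXT NOT NULL,
               answer_id TEXT NOT NULL,
               paper_id TEXT NOT NULL
           )"""
    )
    return conn


def save_results(conn, query, results):
    # results: iterable of dicts with hypothetical keys "answer_id", "paper_id".
    with conn:  # commits on success, rolls back on error
        conn.executemany(
            "INSERT INTO watson_results (query, answer_id, paper_id) "
            "VALUES (?, ?, ?)",
            [(query, r["answer_id"], r["paper_id"]) for r in results],
        )


def results_for(conn, query):
    # Return the stored (answer_id, paper_id) pairs in insertion order.
    rows = conn.execute(
        "SELECT answer_id, paper_id FROM watson_results "
        "WHERE query = ? ORDER BY id",
        (query,),
    )
    return rows.fetchall()
```

Storing only IDs keeps the side database small: the paper text stays in Watson, and an earlier point in the research can be revisited by looking up the IDs recorded for that query.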
