
Information extraction

 

  1. Information extraction is the process of extracting specific (pre-specified) information from textual sources.
    A simple everyday example is when your email client extracts only the relevant data from a message
    (such as an event's date and time) for you to add to your calendar.

  2. By gathering detailed structured data from texts, information extraction enables:


  • The automation of tasks such as smart content classification, integrated search, management and delivery;

  • Data-driven activities such as mining for patterns and trends, uncovering hidden relationships, etc.
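The calendar example above can be illustrated with a minimal sketch. It assumes a single, hypothetical date format ("14 March 2024 at 15:30"); the pattern `EVENT_PATTERN`, the function `extract_event_time` and the sample message are all made up for illustration, and a real email client would need to handle far more variation.

```python
import re
from datetime import datetime

# Assumed input format (hypothetical): "<day> <Month> <year> at <HH:MM>".
EVENT_PATTERN = re.compile(
    r"(?P<day>\d{1,2}) (?P<month>[A-Z][a-z]+) (?P<year>\d{4}) "
    r"at (?P<hour>\d{1,2}):(?P<minute>\d{2})"
)

# Map English month names to month numbers without relying on the locale.
MONTHS = {
    name: number
    for number, name in enumerate(
        ["January", "February", "March", "April", "May", "June", "July",
         "August", "September", "October", "November", "December"],
        start=1,
    )
}


def extract_event_time(message):
    """Return the first date/time found in the message, or None if there is none."""
    match = EVENT_PATTERN.search(message)
    if match is None or match["month"] not in MONTHS:
        return None
    return datetime(
        int(match["year"]),
        MONTHS[match["month"]],
        int(match["day"]),
        int(match["hour"]),
        int(match["minute"]),
    )


message = "Hi all, the project review is on 14 March 2024 at 15:30 in room B."
print(extract_event_time(message))
```

Everything else in the message is ignored: only the pre-specified piece of information (the event time) is returned in a structured form.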



How Does Information Extraction Work?

Typically, extracting structured information from unstructured texts involves the following main subtasks:

  • Pre-processing of the text – this is where the text is prepared for processing with the help of computational
    linguistics techniques such as tokenization, sentence splitting, morphological analysis, etc.

  • Finding and classifying concepts – this is where mentions of people, things, locations, events and other pre-specified
    types of concepts are detected and classified.

  • Connecting the concepts – this is the task of identifying relationships between the extracted concepts.

  • Unifying – this subtask is about presenting the extracted data in a standard form.

  • Getting rid of the noise – this subtask involves eliminating duplicate data.

  • Enriching your knowledge base – this is where the extracted knowledge is ingested into your database for further use.
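The subtasks above can be sketched as a toy pipeline. This is only an illustration under strong assumptions: concepts are found with a small hand-written lookup table (`GAZETTEER`), relations with a single keyword rule, and the "knowledge base" is a plain Python set. All names here are hypothetical, and production systems typically rely on trained models rather than rules like these.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Mention:
    text: str  # unified (normalized) surface form
    kind: str  # concept type, e.g. PERSON


# Hypothetical lookup table standing in for a real concept recognizer.
GAZETTEER = {
    "alice": "PERSON",
    "bob": "PERSON",
    "acme": "ORGANIZATION",
    "london": "LOCATION",
}


def preprocess(text):
    """Pre-processing: split into sentences, then into word tokens."""
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    return [re.findall(r"[A-Za-z]+", s) for s in sentences]


def find_concepts(tokens):
    """Finding and classifying concepts; unifying spellings to one form."""
    return [
        Mention(tok.capitalize(), GAZETTEER[tok.lower()])
        for tok in tokens
        if tok.lower() in GAZETTEER
    ]


def connect(tokens, mentions):
    """Connecting concepts: a toy rule linking people to organizations."""
    if "works" not in (t.lower() for t in tokens):
        return []
    people = [m for m in mentions if m.kind == "PERSON"]
    orgs = [m for m in mentions if m.kind == "ORGANIZATION"]
    return [(p.text, "works_for", o.text) for p in people for o in orgs]


def extract(text, knowledge_base):
    """Run all steps and ingest the resulting triples into the knowledge base."""
    for tokens in preprocess(text):
        mentions = find_concepts(tokens)
        for triple in connect(tokens, mentions):
            knowledge_base.add(triple)  # a set drops duplicate triples
    return knowledge_base


text = "Alice works for ACME. ALICE works for Acme in London. Bob lives in London."
kb = extract(text, set())
print(sorted(kb))
```

The first two sentences state the same fact with different capitalization. Because the mentions are unified before ingestion and the knowledge base is a set, the fact is stored only once.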

Typical Information Extraction Applications

Information extraction can be applied to a wide range of textual sources: from emails and Web pages to reports, presentations,
legal documents and scientific papers. The technology helps address challenges related to content management and knowledge
discovery in areas such as:

  • Business intelligence (for enabling analysts to gather structured information from multiple sources);

  • Financial investigation (for analysis and discovery of hidden relationships);

  • Scientific research (for automated discovery of references or suggestion of relevant papers);

  • Media monitoring (for mentions of companies, brands, people);

  • Healthcare records management (for structuring and summarizing patient records);

  • Pharma research (for drug discovery, discovery of adverse effects and automated analysis of clinical trials).
