
Information extraction

 

  1. Information extraction is the process of extracting specific (pre-specified) information from textual sources.
    A simple everyday example is an email client that pulls only the event data out of a message so that you
    can add it to your calendar.

  2. By gathering detailed structured data from texts, information extraction enables:


  • The automation of tasks such as smart content classification, integrated search, management and delivery;

  • Data-driven activities such as mining for patterns and trends, uncovering hidden relationships, etc.
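
The calendar example above can be sketched in a few lines of Python. This is only a minimal illustration, not how any real email client works: the function name `extract_event_start`, the two regular expressions, and the supported formats ("14 March 2024" and "15:30") are all assumptions made for the example.

```python
import re
from datetime import datetime

# Assumption: dates are written as "<day> <English month name> <year>".
MONTHS = ["January", "February", "March", "April", "May", "June", "July",
          "August", "September", "October", "November", "December"]

DATE_RE = re.compile(r"\b(\d{1,2}) (" + "|".join(MONTHS) + r") (\d{4})\b")
# Assumption: times are written in 24-hour "HH:MM" form.
TIME_RE = re.compile(r"\b([01]?\d|2[0-3]):([0-5]\d)\b")


def extract_event_start(message):
    """Return the first date (and time, if present) found in the message, or None."""
    date_match = DATE_RE.search(message)
    if date_match is None:
        return None
    day, month, year = date_match.groups()

    # Default to midnight when the message contains no time.
    hour, minute = 0, 0
    time_match = TIME_RE.search(message)
    if time_match:
        hour, minute = int(time_match.group(1)), int(time_match.group(2))

    try:
        return datetime(int(year), MONTHS.index(month) + 1, int(day), hour, minute)
    except ValueError:
        # The text looked like a date but is not a valid one (e.g. 31 February).
        return None


print(extract_event_start("Team review on 14 March 2024 at 15:30, room 2."))
# prints: 2024-03-14 15:30:00
```

Everything else in the message is ignored. Only the pre-specified pieces of information (here, a start date and time) are returned in structured form.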



How Does Information Extraction Work?

Typically, for structured information to be extracted from unstructured texts, the following main subtasks are involved:

  • Pre-processing of the text – this is where the text is prepared for processing with the help of computational
    linguistics tools such as tokenization, sentence splitting, morphological analysis, etc.

  • Finding and classifying concepts – this is where mentions of people, things, locations, events and other pre-specified
    types of concepts are detected and classified.

  • Connecting the concepts – this is the task of identifying relationships between the extracted concepts.

  • Unifying – this subtask is about presenting the extracted data in a standard form.

  • Getting rid of the noise – this subtask involves eliminating duplicate data.

  • Enriching your knowledge base – this is where the extracted knowledge is ingested into your database for further use.
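
The subtasks above can be sketched end to end as a toy pipeline. This is a deliberately naive illustration using only the Python standard library. The gazetteer contents, the `mentioned_with` relation, the table layout, and all function names are hypothetical, and real systems typically rely on trained models rather than lookups like these.

```python
import re
import sqlite3

# Hypothetical gazetteer: surface form -> (canonical name, concept type).
# Mapping several surface forms to one canonical name is the "unifying" step.
GAZETTEER = {
    "ada lovelace": ("Ada Lovelace", "PERSON"),
    "lovelace": ("Ada Lovelace", "PERSON"),
    "london": ("London", "LOCATION"),
}


def split_sentences(text):
    # Pre-processing: naive sentence splitting after ., ! or ?
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def find_concepts(sentence):
    # Finding and classifying concepts: whole-word gazetteer lookup.
    lowered = sentence.lower()
    found = []
    for surface, concept in GAZETTEER.items():
        if re.search(r"\b" + re.escape(surface) + r"\b", lowered):
            found.append(concept)
    # Drop repeated concepts within the sentence, keeping first-seen order.
    return list(dict.fromkeys(found))


def connect(concepts):
    # Connecting the concepts: naive heuristic that links every person
    # to every location mentioned in the same sentence.
    people = [name for name, kind in concepts if kind == "PERSON"]
    places = [name for name, kind in concepts if kind == "LOCATION"]
    return [(p, "mentioned_with", l) for p in people for l in places]


def extract(text):
    relations = set()  # Getting rid of the noise: a set drops duplicates.
    for sentence in split_sentences(text):
        relations.update(connect(find_concepts(sentence)))
    return sorted(relations)


def store(relations, conn):
    # Enriching the knowledge base: insert relations, ignoring known ones.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS relations ("
        "subject TEXT, predicate TEXT, object TEXT, "
        "UNIQUE(subject, predicate, object))"
    )
    conn.executemany("INSERT OR IGNORE INTO relations VALUES (?, ?, ?)", relations)
    conn.commit()


if __name__ == "__main__":
    sample = "Ada Lovelace lived in London. Lovelace wrote about London often."
    db = sqlite3.connect(":memory:")
    store(extract(sample), db)
    print(db.execute("SELECT * FROM relations").fetchall())
    # prints: [('Ada Lovelace', 'mentioned_with', 'London')]
```

Both sentences of the sample text yield the same relation, so only one row is stored. The second mention ("Lovelace") is unified with the first through the canonical name in the gazetteer.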

Typical Information Extraction Applications

Information extraction can be applied to a wide range of textual sources: from emails and Web pages to reports, presentations,
legal documents and scientific papers. The technology successfully solves challenges related to content management and knowledge
discovery in the areas of:

  • Business intelligence (for enabling analysts to gather structured information from multiple sources);

  • Financial investigation (for analysis and discovery of hidden relationships);

  • Scientific research (for automated discovery of references or suggestion of relevant papers);

  • Media monitoring (for mentions of companies, brands, people);

  • Healthcare records management (for structuring and summarizing patient records);

  • Pharma research (for drug discovery, discovery of adverse effects and automated analysis of clinical trials).
