Patent application number | Description | Published |
20090063150 | METHOD FOR AUTOMATICALLY IDENTIFYING SENTENCE BOUNDARIES IN NOISY CONVERSATIONAL DATA - Sentence boundaries in noisy conversational transcription data are automatically identified. Noise and transcription symbols are removed, and a training set is formed with sentence boundaries marked based on long silences or on manual markings in the transcribed data. Frequencies of head and tail n-grams that occur at the beginning and ending of sentences are determined from the training set. N-grams that occur a significant number of times in the middle of sentences in relation to their occurrences at the beginning or ending of sentences are filtered out. A boundary is marked before every head n-gram and after every tail n-gram occurring in the conversational data and remaining after filtering. Turns are identified. A boundary is marked after each turn, unless the turn ends with an impermissible tail word or is an incomplete turn. The marked boundaries in the conversational data identify sentence boundaries. | 03-05-2009 |
20110208728 | Computer System, Method, and Computer Program For Extracting Terms From Document Data Including Text Segment - A computer system, method, and article of manufacture for extracting a term from electronic document data that includes a text segment. The system includes: a first extraction unit that uses a first text processing information to extract a noun word from the document data; a second extraction unit that uses a second text processing information to extract a term candidate in relation to the noun word or a corpus that includes text data described in the same language used in the document data; a weight assignment unit that uses a third text processing information to select which type to assign a weight from the plurality of types and assigns the weight to the selected type for each noun word and term candidate; a determination unit that determines the type to which the noun word and term candidate belong; and an output unit to output the noun word and term candidate. | 08-25-2011 |
20120065960 | GENERATING PARSER COMBINATION BY COMBINING LANGUAGE PROCESSING PARSERS - A computer implemented method, a computer system, and a program for generating a parser combination. The method includes: generating a parser combination by combining parsers each associated with at least one grammar description, where the step is carried out using (i) at least one grammar description means and (ii) a computer device. The computer system includes: a processor, a memory connected to the processor, and a parser generator for generating a parser combination in the memory by combining parsers each associated with at least one grammar description, and at least one grammar description type means. | 03-15-2012 |
20120226465 | METHOD, PROGRAM, AND SYSTEM FOR GENERATING TEST CASES - To provide a technique for generating, at a high speed, a smaller-sized set that satisfies an intended property such as, for example, being pair-wise, and includes many test cases that match a set of existing test cases given as an input, candidates to be used from a set of existing input test cases are determined in the following manner: for some parameters, values to be held by test case candidates are determined; test cases having the determined values, among those included in the set of input test cases, are selected as the candidates. A test case having the highest score among one or more test case candidates generated with the method of the related art and one or more test case candidates selected from the set of input test cases is added to a set of output test cases. | 09-06-2012 |
20120330598 | METHOD, PROGRAM, AND SYSTEM FOR GENERATING TEST CASES - To provide a technique for generating, at a high speed, a smaller-sized set that satisfies an intended property such as, for example, being pair-wise, and includes many test cases that match a set of existing test cases given as an input, candidates to be used from a set of existing input test cases are determined in the following manner: for some parameters, values to be held by test case candidates are determined; test cases having the determined values, among those included in the set of input test cases, are selected as the candidates. A test case having the highest score among one or more test case candidates generated with the method of the related art and one or more test case candidates selected from the set of input test cases is added to a set of output test cases. | 12-27-2012 |
20130253916 | EXTRACTING TERMS FROM DOCUMENT DATA INCLUDING TEXT SEGMENT - A computer system, method, and article of manufacture for extracting a term from electronic document data that includes a text segment. The system includes: a first extraction unit that uses a first text processing information to extract a noun word from the document data; a second extraction unit that uses a second text processing information to extract a term candidate in relation to the noun word or a corpus that includes text data described in the same language used in the document data; a weight assignment unit that uses a third text processing information to select which type to assign a weight from the plurality of types and assigns the weight to the selected type for each noun word and term candidate; a determination unit that determines the type to which the noun word and term candidate belong; and an output unit to output the noun word and term candidate. | 09-26-2013 |
20130262088 | Computer-Implemented Method, Program, and System for Identifying Non-Self-Descriptive Terms in Electronic Documents - A computer-implemented method, program, and system for identifying non-self-descriptive terms in electronic documents. The computer-implemented method for identifying a non-self-descriptive term in an electronic document includes a memory and a processor communicatively coupled to the memory and configured to execute the steps of a method. The method includes acquiring a noun included in the corpus data. The method further includes calculating a qualifying level and a qualified level in the corpus data related to each noun in the corpus data. The method further includes identifying one or more nouns included in the corpus data as having a qualifying level and/or qualified level satisfying a predetermined condition. The method further includes presenting a term related to one or more of the nouns in the electronic document as a candidate for the non-self-descriptive term in the electronic document. | 10-03-2013 |
20140295903 | PEER-TO-PEER EMERGENCY COMMUNICATION USING PUBLIC BROADCASTING - A method for emergency communication includes encoding a message for visual display including a "message to" field and a "message from" field. The visual display is revealed to a reading device in communication with a broadcast center, which stores the visual display. The messages are decoded and sorted from visual displays at the broadcast center. On an index channel, a time and channel number for when a message body of the message will be broadcast on a message channel is broadcast. The message body is broadcast on the message channel. | 10-02-2015 |
20150046150 | IDENTIFYING AND AMALGAMATING CONDITIONAL ACTIONS IN BUSINESS PROCESSES - Methods and systems for identifying conditional actions in a business process are disclosed. In accordance with one such method, text fragments are extracted from input documents. In addition, a plurality of pairs of the text fragments that respectively include text fragments that are similar according to a pre-defined similarity standard are determined. For each pair of at least a subset of the pairs, at least one difference between the text fragments of the corresponding pair is determined. Further, at least two particular pairs of the subset of the pairs are merged in response to determining that the particular pairs have at least one of the determined differences in common. Additionally, the merged particular pairs are output to indicate the conditional actions in the business process. | 02-12-2015 |
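
The steps named in the abstract of 20150046150 (pair similar text fragments, find the differences within each pair, merge pairs that share a difference) can be illustrated with a minimal Python sketch. This is not the patented implementation: the function names, the 0.8 threshold, the whitespace tokenization, and the use of `difflib.SequenceMatcher` as the "pre-defined similarity standard" are all assumptions chosen for illustration.

```python
from difflib import SequenceMatcher
from itertools import combinations


def similar_pairs(fragments, threshold=0.8):
    """Return index pairs whose token-level similarity meets the threshold.

    The threshold and the difflib ratio are assumed stand-ins for the
    abstract's "pre-defined similarity standard".
    """
    pairs = []
    for i, j in combinations(range(len(fragments)), 2):
        a, b = fragments[i].split(), fragments[j].split()
        if a != b and SequenceMatcher(None, a, b).ratio() >= threshold:
            pairs.append((i, j))
    return pairs


def differences(frag_a, frag_b):
    """Return the differing phrases of two fragments as unordered pairs."""
    a, b = frag_a.split(), frag_b.split()
    diffs = set()
    for tag, i1, i2, j1, j2 in SequenceMatcher(None, a, b).get_opcodes():
        if tag != "equal":
            diffs.add(frozenset((" ".join(a[i1:i2]), " ".join(b[j1:j2]))))
    return diffs


def merge_pairs(fragments, pairs):
    """Group pairs that have at least one difference in common.

    Returns a list of (differences, fragment indices) tuples.
    """
    groups = []
    for i, j in pairs:
        diffs = differences(fragments[i], fragments[j])
        members = {i, j}
        untouched = []
        for g_diffs, g_members in groups:
            if g_diffs & diffs:  # shared difference: absorb this group
                diffs = diffs | g_diffs
                members = members | g_members
            else:
                untouched.append((g_diffs, g_members))
        untouched.append((diffs, members))
        groups = untouched
    return groups


# Hypothetical input fragments, invented for this example.
fragments = [
    "ship the order if the customer is domestic",
    "ship the order if the customer is international",
    "bill the account if the customer is domestic",
    "bill the account if the customer is international",
    "archive the report every month",
]

pairs = similar_pairs(fragments)
merged = merge_pairs(fragments, pairs)
for diffs, members in merged:
    print(sorted(members), [sorted(d) for d in diffs])
```

In this example the two similar pairs differ only in "domestic" versus "international", so they are merged into one group, which hints that the customer type is the condition governing both actions.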