Herdagdelen
Amac Herdagdelen, Cambridge, MA US
Patent application number | Description | Published |
---|---|---|
20110295840 | GENERALIZED EDIT DISTANCE FOR QUERIES - Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for determining a generalized edit distance for queries. In one aspect, a method includes selecting query pairs of consecutive queries, each query pair being a first query and a second query consecutively submitted as separate queries, each first and second query including at least one term. For each query pair, the method includes selecting term pairs from the query pair, each term pair being a first term in the first query and a second term in the second query; and determining a co-occurrence value for each term pair. The method also includes determining transition costs based on the co-occurrence values for term pairs, each transition cost indicative of a cost of transitioning from a first term in a first query to a second term in a second query consecutive to the first query. | 12-01-2011 |
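The abstract above describes a pipeline: pair up consecutive queries, count term co-occurrences across each pair, and turn those counts into transition costs. The following is a minimal sketch of one way that could look, not the method claimed in the application. The function names, the session data, the cost formula (one minus the relative co-occurrence frequency), and the use of those costs as substitution weights in a word-level edit distance are all illustrative assumptions.

```python
from collections import Counter
from itertools import product


def cooccurrence_counts(sessions):
    """Count (first_term, second_term) pairs over consecutive queries.

    `sessions` is a list of query lists; each inner list holds the queries
    one user submitted in order (hypothetical input format).
    """
    counts = Counter()
    for queries in sessions:
        for first, second in zip(queries, queries[1:]):
            for pair in product(first.split(), second.split()):
                counts[pair] += 1
    return counts


def transition_costs(counts):
    """Assumed cost formula: 1 - count(a, b) / total count of pairs starting with a."""
    totals = Counter()
    for (a, _), n in counts.items():
        totals[a] += n
    return {(a, b): 1.0 - n / totals[a] for (a, b), n in counts.items()}


def generalized_edit_distance(q1, q2, costs, indel=1.0):
    """Word-level edit distance whose substitution cost comes from `costs`.

    Identical terms cost 0; term pairs never observed cost 1.0 (assumption).
    """
    s, t = q1.split(), q2.split()
    d = [[0.0] * (len(t) + 1) for _ in range(len(s) + 1)]
    for i in range(1, len(s) + 1):
        d[i][0] = i * indel
    for j in range(1, len(t) + 1):
        d[0][j] = j * indel
    for i in range(1, len(s) + 1):
        for j in range(1, len(t) + 1):
            a, b = s[i - 1], t[j - 1]
            sub = 0.0 if a == b else costs.get((a, b), 1.0)
            d[i][j] = min(
                d[i - 1][j] + indel,    # delete a term from the first query
                d[i][j - 1] + indel,    # insert a term from the second query
                d[i - 1][j - 1] + sub,  # transition from term a to term b
            )
    return d[-1][-1]


# Made-up query sessions, for illustration only.
sessions = [
    ["cheap flights", "cheap airfare"],
    ["flights boston", "airfare boston"],
]
counts = cooccurrence_counts(sessions)
costs = transition_costs(counts)
seen = generalized_edit_distance("cheap flights", "cheap airfare", costs)
unseen = generalized_edit_distance("cheap flights", "cheap hotels", costs)
```

In this toy data, "flights" is followed by "airfare" in both sessions, so rewriting one into the other is cheaper than rewriting "flights" into a term that never followed it.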
Amac Herdagdelen, Mountain View, CA US
Patent application number | Description | Published |
---|---|---|
20140012855 | Systems and Methods for Calculating Category Proportions - Systems and methods are provided for classifying text based on language using one or more computer servers and storage devices. A computer-implemented method includes receiving a training set of elements, each element in the training set being assigned to one of a plurality of categories and having one of a plurality of content profiles associated therewith; receiving a population set of elements, each element in the population set having one of the plurality of content profiles associated therewith; and calculating using at least one of a stacked regression algorithm, a bias formula algorithm, a noise elimination algorithm, and an ensemble method consisting of a plurality of algorithmic methods the results of which are averaged, based on the content profiles associated with and the categories assigned to elements in the training set and the content profiles associated with the elements of the population set, a distribution of elements of the population set over the categories. | 01-09-2014 |
20140278353 | Systems and Methods for Language Classification - Systems and methods are provided for classifying text based on language using one or more computer servers and storage devices. In general, the systems and methods can include a language classification module for classifying text of an input data set using the output of a training module. In an exemplary embodiment, a bootstrapping step feeds the output of the language classification module back into the training module to increase the accuracy of the language classification module. By iterating the language classification and training modules with input data having certain features, a user can tailor the language classification module for use with text having those or similar features. | 09-18-2014 |
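The bootstrapping step described for application 20140278353 (feeding classifier output back into training and iterating) can be sketched roughly as follows. The abstract does not specify the classifier, the features, or how outputs are selected for feedback, so the character-bigram model, the log-likelihood margin threshold, and every name here are assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict


def bigrams(text):
    """Character bigrams of the lowercased text, padded with spaces."""
    padded = f" {text.lower()} "
    return [padded[i:i + 2] for i in range(len(padded) - 1)]


def train(labeled):
    """Training module (assumed form): per-language bigram counts."""
    model = defaultdict(Counter)
    for text, lang in labeled:
        model[lang].update(bigrams(text))
    return model


def classify(model, text):
    """Classification module (assumed form).

    Returns (best_language, margin), where margin is the gap between the
    two best add-one-smoothed log-likelihood scores.
    """
    vocab = {g for counts in model.values() for g in counts}
    scores = {}
    for lang, counts in model.items():
        total = sum(counts.values()) + len(vocab) + 1
        scores[lang] = sum(math.log((counts[g] + 1) / total) for g in bigrams(text))
    ranked = sorted(scores, key=scores.get, reverse=True)
    margin = scores[ranked[0]] - scores[ranked[1]] if len(ranked) > 1 else float("inf")
    return ranked[0], margin


def bootstrap(seed, unlabeled, rounds=3, min_margin=1.0):
    """Feed confident classifications back into the training set and retrain."""
    labeled = list(seed)
    pool = list(unlabeled)
    for _ in range(rounds):
        model = train(labeled)
        remaining = []
        for text in pool:
            lang, margin = classify(model, text)
            if margin >= min_margin:
                labeled.append((text, lang))  # classifier output becomes training data
            else:
                remaining.append(text)
        if len(remaining) == len(pool):
            break  # nothing new was added this round
        pool = remaining
    return train(labeled)


# Made-up seed and input data, for illustration only.
seed = [
    ("the cat sat on the mat", "en"),
    ("el gato se sentó en la alfombra", "es"),
]
unlabeled = ["the dog sat on the rug", "el perro se sentó en la silla"]
model = bootstrap(seed, unlabeled)
english_guess = classify(model, "the dog sat on the mat")[0]
spanish_guess = classify(model, "el gato en la silla")[0]
```

Choosing the unlabeled input from a particular source is what would tailor the classifier to text with similar features, as the abstract describes; the margin threshold is one simple guard against feeding mistakes back into training.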
Amac Herdagdelen, Izmir TR
Patent application number | Description | Published |
---|---|---|
20130226950 | GENERALIZED EDIT DISTANCE FOR QUERIES - Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for determining a generalized edit distance for queries. In one aspect, a method includes selecting query pairs of consecutive queries, each query pair being a first query and a second query consecutively submitted as separate queries, each first and second query including at least one term. For each query pair, the method includes selecting term pairs from the query pair, each term pair being a first term in the first query and a second term in the second query; and determining a co-occurrence value for each term pair. The method also includes determining transition costs based on the co-occurrence values for term pairs, each transition cost indicative of a cost of transitioning from a first term in a first query to a second term in a second query consecutive to the first query. | 08-29-2013 |