Gruhl, CA

Daniel Gruhl, San Jose, CA US

Patent application number	Description	Published
20110113012	Operating System and File System Independent Incremental Data Backup - Embodiments of the invention relate to creating an operating system and file system independent incremental data backup. A first data backup of a source system and a second version of the data on the source system are received. A second data backup of the second version of the data is created by determining differences between the first data backup and the second version of the data. Each portion of the second version of the data that is the same as a portion of the first data backup is referenced in the second data backup. Each portion of the second version of the data that is different than all portions of the first data backup is included in the second data backup. The second data backup is appended to the first data backup to create an incremental data backup.	05-12-2011
20150220510	INTERACTIVE DATA-DRIVEN OPTIMIZATION OF EFFECTIVE LINGUISTIC CHOICES IN COMMUNICATION - Embodiments of the present invention relate to interactive optimization of messages published to digital media based on past performance of similar messages. In one embodiment, an input token is received. At least one candidate substitute token is retrieved from a dictionary. The dictionary comprises a mapping from the input token to the at least one candidate substitute token. A score associated with the at least one candidate substitute token is determined. A score associated with the input token is determined. The score associated with the input token, the at least one candidate substitute token, and the score associated with the at least one candidate substitute token are outputted.	08-06-2015
20150220643	SCORING PROPERTIES OF SOCIAL MEDIA POSTINGS - Embodiments of the present invention relate to scoring of messages published to digital media based on past performance of similar messages. In one embodiment, an input token is received. A plurality of messages is selected from a corpus of messages. Each of the plurality of messages has a publication time and contents. The contents of each of the plurality of messages include the input token. A plurality of root messages is determined from the plurality of messages. Each of the plurality of root messages relates to at least one related message. The at least one related message is one of the plurality of messages. Each of the plurality of root messages is the earliest message of the corpus of messages related to its at least one related message. A score is determined for the input token based on the plurality of root messages.	08-06-2015
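
The incremental backup described in application 20110113012 above can be illustrated with a minimal sketch. It is not the patented implementation: the fixed-size portions, the SHA-256 digests, and the names `split_blocks`, `incremental_backup`, and `restore` are all assumptions made for illustration. Portions of the second version that already exist in the first backup are stored as references; portions that differ from every portion of the first backup are stored in full.

```python
import hashlib

BLOCK_SIZE = 4  # hypothetical portion size, kept tiny for illustration


def split_blocks(data: bytes, size: int = BLOCK_SIZE) -> list:
    """Split data into fixed-size portions (an assumption of this sketch)."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def incremental_backup(first_backup: bytes, second_version: bytes) -> list:
    """Build the second backup as references into the first backup plus new data."""
    # Index every portion of the first backup by its digest.
    known = {}
    for index, block in enumerate(split_blocks(first_backup)):
        known.setdefault(hashlib.sha256(block).hexdigest(), index)

    entries = []
    for block in split_blocks(second_version):
        digest = hashlib.sha256(block).hexdigest()
        if digest in known:
            # Same as a portion of the first backup: store a reference only.
            entries.append(("ref", known[digest]))
        else:
            # Different from all portions of the first backup: store the data.
            entries.append(("data", block))
    return entries


def restore(first_backup: bytes, entries: list) -> bytes:
    """Rebuild the second version from the first backup and the second backup."""
    blocks = split_blocks(first_backup)
    return b"".join(
        blocks[payload] if kind == "ref" else payload
        for kind, payload in entries
    )
```

In this sketch, `restore(first, incremental_backup(first, second))` reproduces `second`, and only the changed portions occupy space in the second backup.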

Daniel F. Gruhl, San Jose, CA US

Patent application number	Description	Published
20080208893	SYSTEM AND METHOD FOR ADAPTIVE CONTENT PROCESSING AND CLASSIFICATION IN A HIGH-AVAILABILITY ENVIRONMENT - The embodiments of the invention provide systems, methods, etc. for adaptive content processing and classification in a high-availability environment. More specifically, a system is provided having a plurality of processing engines and at least one server that classifies data objects on the computer system. The classification includes analyzing the data objects for the presence of a type of content. This can include assigning a score corresponding to the amount of the type of content in each of the data objects. Moreover, the server can remove a data object from the computer system based on the results of the analyzing. The results of the analyzing are stored and the computer system is updated with feedback information. This can include allowing a user to review the results of the analyzing and aggregating reviews of the user into the feedback information.	08-28-2008
20100070495	FAST-APPROXIMATE TFIDF - Our approach seeks to reduce the complexity of this type of calculation through approximation and pre-computation. It is designed to work efficiently with modern relational database constructs for content management. The approach is designed to enable the kinds of highly interactive data-driven visualizations that are the hallmark of third generation business intelligence.	03-18-2010
20100223226	SYSTEM FOR MONITORING GLOBAL ONLINE OPINIONS VIA SEMANTIC EXTRACTION - A system for transforming domain specific unstructured data into structured data including an intake platform controlled by feedback from a control platform. The intake platform includes an intake acquisition module for acquiring data building baseline data related to a domain and problem of interest, an intake pre-processing module, an intake language module, an intake application descriptors module, and an intake adjudication module. The control platform includes a control data acquisition module, a control data consistency collator, a control auditor, a control event definition and policy repository, an error resolver, and an output that outputs results of the workflow into structured data enabled to be used in data analytics.	09-02-2010
20100223292	HOLISTIC DISAMBIGUATION FOR ENTITY NAME SPOTTING - A method resolves ambiguous spotted entity names in a data corpus by determining an activation level value for each of a plurality of nodes corresponding to a single ambiguous entity name. The activation levels for each of the nodes may be modified by inputting outside domain knowledge corresponding to the nodes to increase the activation value of the nodes, spotting entity names corresponding to the nodes to increase the activation value of the nodes, searching the data corpus to spot newly posted entity names to increase the activation value of the nodes, and searching the data corpus to reduce or deactivate the activation value of the nodes by eliminating false positives. The ambiguous entity name is assigned to the node determined to have the highest activation level and is then outputted to a user.	09-02-2010
20110113016	Method and Apparatus for Data Compression - A method, system, and article for compressing an input stream of uncompressed data. The input stream is divided into one or more data segments. A hash is applied to a first data segment, and an offset and length are associated with this first segment. This hash, together with the offset and length data for the first segment, is stored in a hash table. Thereafter, a subsequent segment within the input stream is evaluated and compared with all other hash entries in the hash table, and a reference is written to a prior hash for an identified duplicate segment. The reference includes a new offset location for the subsequent segment. Similarly, a new hash is applied to an identified non-duplicate segment, with the new hash and its corresponding offset stored in the hash table. A compressed output stream of data is created from the hash table retained on storage media.	05-12-2011
20110113049	Anonymization of Unstructured Data - A method for anonymization of unstructured data comprises determining structured references in the unstructured data; populating a table with the structured references; anonymizing the structured references in the table using ontological analysis; and rewriting the structured references in the unstructured data with the anonymized structured references from the table to produce anonymized data. A system for anonymizing unstructured data comprises an entity spotting module configured to determine structured references in the unstructured data and populate a table with the determined structured references; an anonymization module configured to anonymize the structured references in the table using ontological analysis; and a replacement module configured to rewrite the structured references in the unstructured data with the anonymized structured references from the table to produce anonymized data.	05-12-2011
20110185149	DATA DEDUPLICATION FOR STREAMING SEQUENTIAL DATA STORAGE APPLICATIONS - Data deduplication compression in a streaming storage application is provided. The disclosed deduplication process provides a deduplication archive that enables storage of the archive to, and extraction from, a streaming storage medium. One implementation involves compressing fully sequential data stored in a data repository to a sequential streaming storage, by: splitting fully sequential data into data blocks; hashing content of each data block and comparing each hash to an in-memory lookup table for a match, the in-memory lookup table storing all hashes that have been encountered during the compression of the fully sequential data; for each data block without a hash match, adding the data block as a new data block for compression of fully sequential data; and encoding duplicate data blocks using the in-memory lookup table into data segments.	07-28-2011
20120197848	VALIDATION OF INGESTED DATA - Methods and systems for validating ingested data are disclosed. In accordance with the methods and systems, data elements can be received for storage in slots of an individual descriptor in a storage medium. In addition, at least one validation test can be selected based on a weighting of the data elements that indicates a respective degree of importance of the data elements. The selected validation test or tests can be applied to the data elements stored in the slots to generate respective validation results. Further, a validation score indicating a sufficiency of the stored data elements can be generated based on the validation results.	08-02-2012
20120197902	DATA INGEST OPTIMIZATION - Methods and systems for optimizing the retrieval of data from multiple sources are described. A slot map including slots for the storage of data elements can be obtained. The data elements associated with the slots can be prioritized by weighting values with costs of retrieving the data elements from respective data sources. Each value can be associated with a different data element and can indicate a respective degree of importance of the associated data element. Further, the systems and methods can direct the retrieval of data elements from the respective data sources in an order in accordance with the priority of the data elements to optimize the quality of data obtainable within a critical time constraint. In addition, the retrieved data elements can be stored in corresponding slots on a storage medium.	08-02-2012
20120330901	VALIDATION OF INGESTED DATA - Methods and systems for validating ingested data are disclosed. In accordance with the methods and systems, data elements can be received for storage in slots of an individual descriptor in a storage medium. In addition, at least one validation test can be selected based on a weighting of the data elements that indicates a respective degree of importance of the data elements. The selected validation test or tests can be applied to the data elements stored in the slots to generate respective validation results. Further, a validation score indicating a sufficiency of the stored data elements can be generated based on the validation results.	12-27-2012
20120330972	DATA INGEST OPTIMIZATION - Methods and systems for optimizing the retrieval of data from multiple sources are described. A slot map including slots for the storage of data elements can be obtained. The data elements associated with the slots can be prioritized by weighting values with costs of retrieving the data elements from respective data sources. Each value can be associated with a different data element and can indicate a respective degree of importance of the associated data element. Further, the systems and methods can direct the retrieval of data elements from the respective data sources in an order in accordance with the priority of the data elements to optimize the quality of data obtainable within a critical time constraint. In addition, the retrieved data elements can be stored in corresponding slots on a storage medium.	12-27-2012
20140359625	DETECTION AND CORRECTION OF RACE CONDITIONS IN WORKFLOWS - A race condition in a workflow representation is detected and corrected. First and second contracts are retrieved for respective first and second analytics of the workflow representation, wherein the contracts specify input types and output types of their analytics. Both contracts include information required to execute their respective analytics by a workflow executor. It is determined that the output type of the first analytic matches the input type of the second analytic based on a comparison of the first contract and the second contract, and that the workflow representation does not include a directed edge connecting the first analytic to the second analytic. The inclusion of a directed edge in the workflow representation connecting the first analytic to the second analytic will correct the race condition in the workflow representation.	12-04-2014
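
The race-condition check described in application 20140359625 above can be sketched as follows. This is only an illustration under stated assumptions: the `Contract` class, the `missing_edges` function, and the representation of a workflow as a set of `(source, target)` edge tuples are hypothetical and do not come from the application. The sketch reports every pair of analytics where an output type of the first matches an input type of the second but no directed edge connects them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Contract:
    """Hypothetical contract: the input and output types of one analytic."""
    analytic: str
    input_types: frozenset
    output_types: frozenset


def missing_edges(contracts, edges):
    """Return (first, second) pairs whose types match but that lack a directed edge."""
    missing = []
    for first in contracts:
        for second in contracts:
            if first.analytic == second.analytic:
                continue
            types_match = bool(first.output_types & second.input_types)
            if types_match and (first.analytic, second.analytic) not in edges:
                # Adding this edge would order the two analytics explicitly.
                missing.append((first.analytic, second.analytic))
    return missing
```

For example, given a hypothetical `tokenizer` analytic that outputs `tokens` and a `tagger` analytic that consumes `tokens`, a workflow with no edge between them yields `[("tokenizer", "tagger")]`; once that edge is present, the result is empty.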

Daniel Frederick Gruhl, San Jose, CA US

Patent application number	Description	Published
20080215585	SYSTEM AND METHOD FOR CREATION, REPRESENTATION, AND DELIVERY OF DOCUMENT CORPUS ENTITY CO-OCCURRENCE INFORMATION - To respond to queries that relate to co-occurring entities on the Web, a compact sparse matrix representing entity co-occurrences is generated and then accessed to satisfy queries. The sparse matrix has groups of sub-rows, with each group corresponding to an entity in a document corpus. The groups are sorted from most occurring entity to least occurring entity. Each sub-row within a group corresponds to an entity that co-occurs in the document corpus, within a co-occurrence criterion, with the entity represented by the group, and to facilitate query response the sub-rows within a group are sorted from most occurring co-occurrence to least occurring co-occurrence.	09-04-2008
20080222146	SYSTEM AND METHOD FOR CREATION, REPRESENTATION, AND DELIVERY OF DOCUMENT CORPUS ENTITY CO-OCCURRENCE INFORMATION - To respond to queries that relate to co-occurring entities on the Web, a compact sparse matrix representing entity co-occurrences is generated and then accessed to satisfy queries. The sparse matrix has groups of sub-rows, with each group corresponding to an entity in a document corpus. The groups are sorted from most occurring entity to least occurring entity. Each sub-row within a group corresponds to an entity that co-occurs in the document corpus, within a co-occurrence criterion, with the entity represented by the group, and to facilitate query response the sub-rows within a group are sorted from most occurring co-occurrence to least occurring co-occurrence.	09-11-2008
20080222723	MONITORING AND CONTROLLING APPLICATIONS EXECUTING IN A COMPUTING NODE - A method and system for monitoring and controlling applications executing on computing nodes of a computing system. A status request process, one or more control processes, an untrusted application and one other application are executed on a computing node. The status request process receives and processes requests for the statuses of the untrusted and the other application. A first control process controls the execution of the untrusted application. A second control process controls the execution of the other application. The execution of the untrusted application terminates based on a failure of the untrusted application. A capability of the status request process to receive and process the requests for statuses, and a capability of the second control process to control the execution of the other application are preserved in response to the termination of the untrusted application.	09-11-2008
20080307326	SYSTEM, METHOD, AND SERVICE FOR INDUCING A PATTERN OF COMMUNICATION AMONG VARIOUS PARTIES - A communication pattern inducing system focuses on the propagation of topics amongst a plurality of nodes based on the text of the node rather than hyperlinks of the node. A node could represent a weblog or any other source of information such as a person, a conversation, images, etc. The system utilizes a model for information diffusion, wherein the parameters of the model capture how a new topic spreads from node to node. The system further comprises a process to learn the parameters of the model based on real data and to apply the process to real (or synthetic) node data. Consequently, the system is able to identify particular individuals that are highly effective at contributing to the spread of topics.	12-11-2008
20090192784	SYSTEMS AND METHODS FOR ANALYZING ELECTRONIC DOCUMENTS TO DISCOVER NONCOMPLIANCE WITH ESTABLISHED NORMS - A computer-implemented method for analyzing documents to discover noncompliance with an established norm is provided. The method can include receiving one or more terms indicating possible noncompliance with a pre-established norm, and, based upon the at least one term, constructing at least one grammatical unit. The grammatical unit can specify a predetermined syntax and can correspond to semantic content that is indicative of noncompliance with the pre-established norm, wherein the norm can include a statute, regulation, policy, or other standard. The method can further include identifying from among multiple electronic documents each document that contains one or more grammatical units specifying a predetermined syntax and corresponding to semantic content indicative of noncompliance with the pre-established norm.	07-30-2009
20090248614	SYSTEM AND METHOD FOR CONSTRUCTING TARGETED RANKING FROM MULTIPLE INFORMATION SOURCES - Embodiments of the invention provide a system and method for determining preferences from information mashups and, in particular, a system and method for constructing a ranked list from multiple sources. In an exemplary embodiment, the system and method tunably combines multiple ranked lists by computing a score for each item within the list, wherein the score is a function of the associated rank of the item within the list. In one exemplary embodiment, the function is equal to 1/(n^(1/p)), where p is a tuning parameter that enables selection between responsiveness in the combined ranking to one candidate ranked highly in one source versus responsiveness in the combined ranking to a candidate with lower but broader support among the various sources ranking the candidates.	10-01-2009
20090248690	SYSTEM AND METHOD FOR DETERMINING PREFERENCES FROM INFORMATION MASHUPS - A system and method for determining preferences from information mashups and, in particular, for determining preferences from cross-modality information based on a social welfare function is disclosed. An exemplary embodiment of the invention uses a social welfare function (SWF) to identify a vote computing method from among a group of vote computing methods. The SWF embodies subjective values, e.g. business objectives. The embodiment uses the SWF to identify the vote computing method that combines cross-modality information into a single information mashup in a manner that is most congruent with the subjective values relative to the other vote computing methods. The information mashup may be in the form of a single, merged ranked list.	10-01-2009
20120259890	KNOWLEDGE-BASED DATA MINING SYSTEM - In a data mining system, data is gathered into a data store using, e.g., a Web crawler. The data is classified into entities. Data miners use rules to process the entities and append respective keys to the entities representing characteristics of the entities as derived from rules embodied in the miners. With these keys, characteristics of entities as defined by disparate expert authors of the data miners are identified for use in responding to complex data requests from customers.	10-11-2012
20130054268	SYSTEMS AND METHODS FOR ABSTRACTING IMAGE AND VIDEO DATA - Systems and methods for removing or suppressing information in images and video frames are described herein. In particular, systems and methods provide for removing information from images capable of identifying individuals related to the image. For example, embodiments provide for the removal of protected health information (PHI) from source images, including medical images, video frames, and documents converted to images or video frames. In addition, embodiments operate on actual images and video frames as opposed to data extracted from such sources. In particular, embodiments provide for the creation of a PHI filter for an individual of interest comprised of identifying information. Images are filtered using the PHI filter and information potentially identifying the individual of interest is located and removed from the image.	02-28-2013
20130054625	AUTOMATED INFORMATION DISCOVERY AND TRACEABILITY FOR EVIDENCE GENERATION - Described herein are methods, systems, apparatuses and products for automated information discovery and traceability for evidence generation. An aspect provides for accessing a mapping of a plurality of connected nodes stored in a memory device, said mapping being discovered via a network scan based on a seed set, said plurality of connected nodes storing a plurality of archived healthcare records; accessing content stored in a memory device and ingested from said plurality of connected nodes; and determining a longitudinal healthcare record from the mapping and the content ingested from said plurality of connected nodes. Other embodiments are disclosed.	02-28-2013
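
The tunable rank combination in application 20090248614 above (a per-item score of 1/(n^(1/p))) can be sketched as below. The sketch assumes that n is the 1-based rank of an item within a list and that per-list scores are summed across lists; the abstract does not state either detail, and the function name `combine_rankings` is hypothetical.

```python
from collections import defaultdict


def combine_rankings(ranked_lists, p):
    """Merge ranked lists by summing 1 / n**(1/p) over each item's rank n."""
    totals = defaultdict(float)
    for ranking in ranked_lists:
        # n is assumed to be the 1-based position of the item in its list.
        for n, item in enumerate(ranking, start=1):
            totals[item] += 1.0 / (n ** (1.0 / p))
    # Highest total first; ties are broken alphabetically for determinism.
    return sorted(totals, key=lambda item: (-totals[item], item))
```

With the three hypothetical source lists `["a", "b", "c"]`, `["d", "b", "c"]`, and `["e", "b", "c"]`, a small p such as 0.25 favors items ranked first in a single source, while a larger p such as 4 favors `b`, which has lower but broader support across the sources.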

Tim Gruhl, Palmdale, CA US

Patent application number	Description	Published
20150319850	Board Integrated Interconnect - In an embodiment, a method includes forming a printed circuit board by depositing a first plurality of layers and forming an interconnect integral to the printed circuit board by depositing a second plurality of layers on at least a portion of the first plurality of layers. The interconnect includes a stabilizing structure and a contact positioned within the stabilizing structure. The stabilizing structure includes a first material and the contact includes a second material that is different than the first material.	11-05-2015