
Cloudera Assigned Patent

Sampling large data sets in distributed storage

Cloudera, Inc., Palo Alto, CA, has been assigned a patent (10,866,874) developed by Shaun Ahmadian, San Jose, CA, and Sushil Thomas, San Francisco, CA, for an apparatus and method for sampling large data sets in distributed data storage.

The abstract of the patent published by the U.S. Patent and Trademark Office states: A system includes a distributed data storage system disseminated across worker machines connected by a network. A distributed data storage management module has instructions executed by a processor to utilize data block identifiers to track data block accesses to the distributed data storage system. A sampling module with instructions executed by the processor receives a new sample request from a client machine connected to the network. Initial data block samples are gathered from the distributed data storage system during a first time period. A revised sample request is received from the client machine during the first time period. The initial data block samples are gathered. New data block samples are collected from the distributed data storage system. The initial data block samples and the new data block samples are combined to form cumulative data block sample results. The cumulative data block sample results are supplied to the client machine.
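The flow described in the abstract (gather initial samples, accept a revised request, keep what was already gathered, collect only the additional blocks, and return the combined result) can be illustrated with a minimal Python sketch. This is not the patented implementation: the class `CumulativeSampler`, its `request` method, and the `_read_block` stand-in are hypothetical names introduced here, and uniform random selection of block identifiers is an assumption the abstract does not state.

```python
import random


class CumulativeSampler:
    """Hypothetical sketch: reuses already-gathered block samples when a request is revised."""

    def __init__(self, block_ids, seed=0):
        self._block_ids = list(block_ids)   # identifiers of all data blocks in storage
        self._rng = random.Random(seed)     # seeded for repeatable illustration
        self._sampled = {}                  # block id -> sampled payload gathered so far

    def _read_block(self, block_id):
        # Stand-in for reading a block from a worker machine (assumption).
        return f"data-from-{block_id}"

    def request(self, target_count):
        """Grow the sample to target_count blocks, keeping earlier samples."""
        remaining = [b for b in self._block_ids if b not in self._sampled]
        needed = max(0, min(target_count - len(self._sampled), len(remaining)))
        # Collect only the new blocks; earlier samples are kept, not re-read.
        for block_id in self._rng.sample(remaining, needed):
            self._sampled[block_id] = self._read_block(block_id)
        # Cumulative result: initial samples plus new samples.
        return dict(self._sampled)


sampler = CumulativeSampler(range(100), seed=42)
initial = sampler.request(5)       # new sample request
cumulative = sampler.request(12)   # revised request: 5 kept, 7 newly collected
```

In this sketch the block identifiers double as the bookkeeping key, so a revised request costs only the reads for blocks not yet sampled.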

The patent application was filed on June 27, 2019 (16/455,026).
