Image Matching
The Image Matching service identifies duplicate or near-duplicate images. It calculates a so-called "visual hash" that should be the same or nearly the same for images that are only slightly modified, for example by a color shift (such as conversion to black and white), re-compression, a change of resolution, or added noise.
The API follows the general rules of Ximilar API as described in Section First steps.
The API is a set of HTTP REST services accepting JSON-formatted documents using POST and returning JSON documents. The base URL for this service is:
https://api.ximilar.com/image_matching/v2/<method>
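As a rough illustration of the request shape described above, the following Python sketch prepares a JSON POST request for one method under the base URL. The helper name `build_request` is hypothetical, and the `Token` authorization header is taken from the curl examples in this document. The request is only built, not sent.

```python
import json
import urllib.request

BASE_URL = "https://api.ximilar.com/image_matching/v2/"


def build_request(method: str, payload: dict, api_token: str) -> urllib.request.Request:
    """Prepare (but do not send) a JSON POST request for one API method."""
    return urllib.request.Request(
        url=BASE_URL + method.lstrip("/"),
        data=json.dumps(payload).encode("utf-8"),
        headers={
            # Header format copied from the curl examples in this document.
            "Authorization": f"Token {api_token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# __API_TOKEN__ is a placeholder, as in the curl examples.
req = build_request("visual_hash", {"records": []}, "__API_TOKEN__")
```

To send the request, pass `req` to `urllib.request.urlopen` and decode the JSON body of the response.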
Overview of API Methods
The following methods are "stateless" - they work solely with the images passed in the request:
- /v2/ping -- test the service and get basic info about it
- /v2/visual_hash -- get visual hash(es) for given image or images
- /v2/remove_duplicates -- get a set of images and merge the ones that are duplicates or near-duplicates
- /v2/rank_images -- get one "query" image and a set of "data" images and rank the data images by hash-based similarity to the query image
The Image Matching service also provides an option to store information about your image database in a Ximilar collection and then match images with these stored images. The principle and the API are the same as for Photo & Product Similarity:
- see the Photo & Product Similarity documentation and use the API prefix https://api.ximilar.com/image_matching/v2/
Parameters of API methods
The Ximilar Search API works with data records, each of which represents a single image. A record has the same format in all operations and in the responses. It is a JSON record (map) with the following fields:
- _url -- URL with a PNG, JPG, or TIFF image file
- _base64 -- base64-encoded content of a PNG, JPG or TIFF image file
- attribute -- a JSON representation of any attribute of the record; these attributes are returned by the method and can be used for identification of individual records within the answer. We typically use attribute _id as unique image ID.
Example of image records in the field records, which is used by all API methods:
{
"records":
[
{
"_id": "1",
"_url": "https://yourdomain.com/images/product_image_321.jpg"
},
{
"_id": "2",
"_base64": "data:image/jpeg;base64,/9j/4A...."
}
]
}
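The record format above could be assembled in Python roughly as follows. The helper names `record_from_url` and `record_from_bytes` are hypothetical. The data-URI prefix on `_base64` mirrors the example above; whether bare base64 is also accepted is not stated here, so treat that detail as an assumption.

```python
import base64


def record_from_url(url: str, **attributes) -> dict:
    """Build an image record that points the service at an image URL."""
    return {"_url": url, **attributes}


def record_from_bytes(image_bytes: bytes, mime_type: str = "image/jpeg", **attributes) -> dict:
    """Build an image record that carries the image inline as base64."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    # Assumption: the "data:<mime>;base64," prefix follows the example record.
    return {"_base64": f"data:{mime_type};base64,{encoded}", **attributes}


payload = {
    "records": [
        record_from_url("https://yourdomain.com/images/product_image_321.jpg", _id="1"),
        # Four JPEG header bytes stand in for real file content here.
        record_from_bytes(b"\xff\xd8\xff\xe0", _id="2"),
    ]
}
```

Extra keyword arguments such as `_id` become attributes of the record, which the service returns and which can be used to identify records in the answer.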
Return Values
All API methods return:
- HTTP status code 2XX if the method succeeded, or another HTTP status code if the method failed
- JSON-formatted body with the status, answer and statistics
Answer fields common for all types of answers:
- statistics -- a map of various statistics about the processing. The only statistic included every time is:
  - processing time -- time of actual processing of the query (in seconds)
- status -- a JSON map with a status of the method processing. It contains these subfields:
  - code -- a numeric code of the operation status; it follows the concept of HTTP status codes (2XX, 4XX). Specific codes are described for each type of answer (or operation) (see below).
  - text -- a text describing the status code
  - error_description -- if the processing ended with an error (codes 4XX), this field contains a detailed description of the error; this might include Java stack traces.
Generic statuses that can be returned by any operation:
"status": {"code": 200, "text": "OK"}
"status": {"code": 402, "text": "aborted by error", "error_description": "..."}
"status": {"code": 500, "text": "unknown error", "error_description": "..."}
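A client might inspect the status map along these lines. This is a minimal sketch: `ApiError` and `check_status` are hypothetical names, and treating every code of 400 or above as a failure is an assumption based on the generic statuses listed above.

```python
class ApiError(Exception):
    """Raised when a response body reports a failed operation."""


def check_status(body: dict) -> dict:
    """Return the body unchanged, or raise ApiError if its status is an error."""
    status = body.get("status", {})
    code = status.get("code")
    # Assumption: codes of 400 and above signal failure, as in the statuses above.
    if code is None or code >= 400:
        detail = status.get("error_description") or status.get("text", "no status in response")
        raise ApiError(f"{code}: {detail}")
    return body


ok_body = check_status({"status": {"code": 200, "text": "OK"}})
```

Preferring `error_description` over `text` surfaces the more detailed message when the service provides one.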
Detailed Descriptions of API Methods
/v2/ping
Description: returns basic information about the index
Example:
curl --request POST \
--url https://api.ximilar.com/image_matching/v2/ping \
--header 'authorization: Token __API_TOKEN__'
Returns:
{
"status": {
"code": 200,
"text": "OK"
},
"_service_info": {
"_name": "Image matching service",
"_info": "Get visual hashes, find (near-)duplicate images and rank them"
}
}
/v2/visual_hash
Description: get a visual hash (or several different types of hashes) for given image(s)
Parameters:
- records: list of photos to get hashes for
  - must contain either of _url or _base64 field - see section image data for details
- hash_type: determine type of visual hash that is used for computing (by default both bmh1 and phash are computed)
Example:
curl --request POST \
--url https://api.ximilar.com/image_matching/v2/visual_hash \
--header 'authorization: Token __API_TOKEN__' \
--header 'content-type: application/json' \
--data '{
"records": [
{"_url": "https://images.ximilar.com/examples/fashion_products/10073009-HERO.jpeg"}
]
}'
Returns:
{
"records": [
{
"_url": "https://images.ximilar.com/examples/fashion_products/10073009-HERO.jpeg",
"_width": 400,
"_height": 400,
"bmh1": "11111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111001111111111111111111111111111110000111111111111111101111111111110001111111101111111000000001111110011111111001111110000000011111110111111110001111100000000111111111111111100001111000000001111111111111111000001111000000011111111111111110000001110000000111111111111111100000001100000001111111101111111000000001000000011111111001111110000000010000000111111110001111100000000100000001111111100001111000000001100000011111111000001110000000000000000111111110000001100000000000000001100000000000001000000000000000011000000000000000000000000000000001000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000001001100000000000001110000000000011111111000000000111111110000000001",
"phash": "1111110000001011010000111110110010111001000101100010010110001010"
}
],
"statistics": {
"processing time": 0.13515067100524902
},
"status": {
"code": 200,
"text": "OK"
}
}
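The returned hashes are strings of 0 and 1 characters. One common way to compare two such hashes client-side is the Hamming distance (the number of differing bit positions). This is only an illustrative sketch: the metric the service itself uses is not documented here, and the second hash below is made up for the example.

```python
def hamming_distance(hash_a: str, hash_b: str) -> int:
    """Count the positions at which two equal-length bit strings differ."""
    if len(hash_a) != len(hash_b):
        raise ValueError("hashes must have the same length")
    return sum(bit_a != bit_b for bit_a, bit_b in zip(hash_a, hash_b))


# The phash value from the response above.
phash = "1111110000001011010000111110110010111001000101100010010110001010"
# Hypothetical second hash: the same value with its last two bits flipped.
other = phash[:-2] + "01"

distance = hamming_distance(phash, other)
```

A small distance suggests near-duplicate images; hashes of different types (bmh1 and phash) should not be compared with each other.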
/v2/remove_duplicates
Description: merge the images/records that are matching (based on visual hashes).
Things to note
- Be aware that when two or more records match, the matching records are moved to the field removed_records of the first record. See the return value for an example.
Parameters:
- records: list of photos to merge
  - must contain either of _url or _base64 field - see section image data for details
- hash_type: determine type of visual hash that is used for comparison (default bmh1, optional phash)
- range: specify the minimum threshold value that is used for clustering for removing duplicates (default 0)
Example:
curl --request POST \
--url https://api.ximilar.com/image_matching/v2/remove_duplicates \
--header 'authorization: Token __API_TOKEN__' \
--header 'content-type: application/json' \
--data '{
"records": [
{"_url": "__URL_PATH_1__", "_id": 1}, {"_url": "__URL_PATH_1__", "_id": 2}, {"_url": "__URL_PATH_2__", "_id": 3}
]
}'
Returns:
{
"records": [
{
"_url": "__URL_PATH_1__",
"_status": {
"code": 200,
"text": "OK",
"request_id": "7a1cf0ee-a2dd-4fa2-9927-69441bc1d3dc"
},
"_id": "1",
"_width": 259,
"_height": 460,
"removed_records": [
{
"_url": "__URL_PATH_1__",
"_status": {
"code": 200,
"text": "OK",
"request_id": "7a1cf0ee-a2dd-4fa2-9927-69441bc1d3dc"
},
"_id": "2",
"_width": 259,
"_height": 460
}
]
},
{
"_url": "__URL_PATH_2__",
"_status": {
"code": 200,
"text": "OK",
"request_id": "7a1cf0ee-a2dd-4fa2-9927-69441bc1d3dc"
},
"_id": "3",
"_width": 212,
"_height": 289
}
],
"status": {
"code": 200,
"text": "OK",
"request_id": "7a1cf0ee-a2dd-4fa2-9927-69441bc1d3dc",
"proc_id": "d9ac827a-ee8e-4e3b-a5d2-665c10e3fa84"
},
"statistics": {
"processing time": 0.7242739200592041
}
}
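To work with this response, a client could collect which records were merged into which. The sketch below uses a hypothetical helper, `duplicate_groups`, and a sample trimmed down from the response above to just the fields it reads.

```python
def duplicate_groups(response: dict) -> dict:
    """Map each kept record's _id to the _ids of the records merged into it."""
    groups = {}
    for record in response.get("records", []):
        # Records without duplicates have no removed_records field.
        removed = record.get("removed_records", [])
        groups[record["_id"]] = [duplicate["_id"] for duplicate in removed]
    return groups


# Trimmed-down version of the response shown above.
sample = {
    "records": [
        {"_id": "1", "removed_records": [{"_id": "2"}]},
        {"_id": "3"},
    ]
}
groups = duplicate_groups(sample)
```

This relies on every record carrying an `_id`, as in the example request.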
/v2/rank_images
Description: rank the images/records against the query image, based on image hash.
Things to note
- Be aware that the request of method /v2/rank_images contains the records and query_record fields, and the response contains query_records and answer_records. This is because the method imitates the ranking endpoint of the Photo and Product Similarity service.
Parameters:
- query_record: a record/image that you want to compare against records
- records: list of images to rank
  - must contain either of _url or _base64 field - see section image data for details
- hash_type: determine type of visual hash that is used for comparison (default bmh1, optional phash)
Example:
curl --request POST \
--url https://api.ximilar.com/image_matching/v2/rank_images \
--header 'authorization: Token __API_TOKEN__' \
--header 'content-type: application/json' \
--data '{
"query_record": {
"_url": "__URL_PATH_1__"
},
"records": [
{
"_url": "__URL_PATH_1__"
},
{
"_url": "__URL_PATH_2__"
},
{
"_url": "__URL_PATH_3__"
},
{
"_url": "__URL_PATH_4_NOT_WORKING__"
}
]
}'
Returns:
- HTTP status code 2XX if the method succeeded, or another HTTP status code if the method failed.
- Body of the response is a JSON object (map) with the following fields:
  - status - a JSON map with a status of the method processing. It contains these subfields:
    - code - a numeric code of the operation status; it follows the concept of HTTP status codes (2XX, 4XX). Specific codes are described for each type of answer (or operation) (see below).
    - text - a text describing the status code
  - statistics - a map of various statistics about the processing. The only statistic included every time is:
    - processing time - time of actual processing of the query (in seconds)
  - query_records - the record/image that you compared against records (returned as an array with one record)
  - answer_records - array of records sorted from the best match to the worst
  - answer_distances - array of distance values that correspond to the answer_records array; the lower the value, the closer the record is to the query record
  - skipped_records - if the analysis of a record fails (most commonly because of a wrong image URL), the record is listed here
{
"query_records": [
{
"_url": "__URL_PATH_1__",
"_width": 259,
"_height": 460
}
],
"answer_distances": [
0.0,
0.1,
26.0
],
"answer_records": [
{
"_url": "__URL_PATH_1__",
"_id": "e7ee2a82-495f-4df7-adc3-5cdb2b5fadf7",
"_width": 259,
"_height": 460
},
{
"_url": "__URL_PATH_2__",
"_id": "b170de44-75e2-4c4c-a41a-ff1ed9fa84b6",
"_width": 259,
"_height": 460
},
{
"_url": "__URL_PATH_3__",
"test": "insomnia",
"_id": "ee72be0b-ab93-4696-9689-7f866ea9bb38",
"_width": 212,
"_height": 289
}
],
"skipped_records": [
{
"_url": "__URL_PATH_4_NOT_WORKING__",
"_status": {
"code": 400,
"text": "Error Loading Image: Unable to download image from '__URL_PATH_4_NOT_WORKING__', Attempts: 3",
"request_id": "a798e682-2b89-49ea-bb6b-0c2d09d523c1"
},
"_id": "dccb8eab-c4da-4fc6-b24a-1d92fe96f75e"
}
],
"status": {
"code": 300,
"text": "MIXED_RESULT",
"request_id": "a798e682-2b89-49ea-bb6b-0c2d09d523c1",
"proc_id": "19ef9e49-0c6b-4dca-a52a-80726da178ed"
},
"statistics": {
"processing time": 0.9733231067657471
}
}
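Since answer_records and answer_distances are parallel arrays, a client would typically pair them up. The sketch below does that with a hypothetical helper, `ranked_matches`. The cut-off value of 1.0 is chosen only for illustration; a suitable threshold depends on the hash type and your data.

```python
from typing import Optional


def ranked_matches(response: dict, max_distance: Optional[float] = None) -> list:
    """Pair each answer record's _url with its distance, keeping ranked order."""
    pairs = zip(response.get("answer_records", []), response.get("answer_distances", []))
    return [
        (record.get("_url"), distance)
        for record, distance in pairs
        if max_distance is None or distance <= max_distance
    ]


# Trimmed-down version of the response shown above.
sample = {
    "answer_records": [
        {"_url": "__URL_PATH_1__"},
        {"_url": "__URL_PATH_2__"},
        {"_url": "__URL_PATH_3__"},
    ],
    "answer_distances": [0.0, 0.1, 26.0],
    "skipped_records": [{"_url": "__URL_PATH_4_NOT_WORKING__"}],
}

# Hypothetical threshold: keep only matches with distance <= 1.0.
near = ranked_matches(sample, max_distance=1.0)
skipped = [record["_url"] for record in sample.get("skipped_records", [])]
```

Checking `skipped_records` separately is worthwhile because a response with skipped records still carries ranked answers for the remaining images, as the MIXED_RESULT status above shows.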