Topcat's Queries

Brian Ritchie, 2019-07-17/18.

In a previous document I described the mechanisms that Topcat uses to build the queries that it sends to ICAT (via the entityManager interface). That document concentrates on the construction process, but only partly documents what queries Topcat actually constructs, and when it sends them. This document attempts to complete the picture. There is necessarily some overlap between the two documents.

To determine what queries Topcat builds, I started at the bottom, by looking at the places where the Javascript submits requests to the ICAT entityManager interface, and then worked upwards. That is the order in which I present things here.

(Nonetheless, the example queries are obtained by observation of entityManager requests in the browser console rather than calculation! This is easier than trying to determine what else the queryBuilder code might do to the initial queries.)

Direct entityManager references

There are only two, both in tc-icat.service:

  this.query = helpers.overload({
    ...
            	'array, object': function(query, options){    	
    	        	var defered = $q.defer();
                    var query = helpers.buildQuery(query);
                    var key = "query:" + query;

    	        	this.cache().getPromise(key, 10 * 60 * 60, function(){
                        return that.get('entityManager', {
                            sessionId: that.session().sessionId,
                            query: query,
                            server: facility.config().icatUrl
                        }, options);
                    }).then(function(results){
    ...

All of Topcat's (Javascript-side) queries go through here. Note the use of this.cache().getPromise(key,...): this means that all queries are cached (in Topcat's browser-side cache), keyed by the final query string that is generated by helpers.buildQuery() (see the previous document for a description of that). This doesn't necessarily mean that each distinct query only gets passed to ICAT once, though: because promises resolve asynchronously, multiple ICAT requests could be sent for the same query before any of them has had a chance to be cached. (I've seen this happen in getSize requests, but haven't confirmed whether it happens for queries in practice.)
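
The race described above can be illustrated with a minimal sketch. This is not Topcat's cache implementation: makeResultCache, makePromiseCache and countRequests are hypothetical names, and the fetcher only counts calls rather than contacting ICAT. The first cache stores a value only once its promise has resolved, so two back-to-back calls both miss; the second stores the in-flight promise itself, so the second call reuses it.

```javascript
// Hypothetical cache that stores a value only after its promise resolves.
function makeResultCache() {
  const store = new Map();
  return {
    getPromise: function (key, fetcher) {
      if (store.has(key)) return Promise.resolve(store.get(key));
      return fetcher().then(function (value) {
        store.set(key, value); // too late for any call made in the meantime
        return value;
      });
    }
  };
}

// Hypothetical variant that stores the in-flight promise immediately.
function makePromiseCache() {
  const store = new Map();
  return {
    getPromise: function (key, fetcher) {
      if (!store.has(key)) store.set(key, fetcher());
      return store.get(key);
    }
  };
}

// Issue the same "query" twice in a row and count how often the
// (stand-in) request function is actually invoked.
function countRequests(cache) {
  let requests = 0;
  const fetcher = function () {
    requests++;
    return Promise.resolve(["row"]);
  };
  const key = "query:select facility from Facility facility";
  cache.getPromise(key, fetcher);
  cache.getPromise(key, fetcher);
  return requests;
}

const resultCacheRequests = countRequests(makeResultCache());
const promiseCacheRequests = countRequests(makePromiseCache());
```

Under these assumptions resultCacheRequests is 2 and promiseCacheRequests is 1. Which variant Topcat's cache more closely resembles is not established here.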

The other request to the entityManager isn't about queries, but about entity creation:

            this.write = helpers.overload({
                /**
                 * Creates or updates entities for this Icat.
                 * 
                 * @method
                 * @name Icat#write
                 * @param  {object[]} entities an array of entities to be written to this Icat
                 * @param  {object} options {@link https://docs.angularjs.org/api/ng/service/$http#usage|as specified in the Angular documentation}
                 * @return {Promise<number[]>} a deferred array of entity ids that have been created
                 */
                'array, object': function(entities, options){
                    return this.post('entityManager', {
                        sessionId: this.session().sessionId,
                        entities: JSON.stringify(entities)
                    }, options);
                },

I believe this is only used by Topcat's upload mechanism to create a Dataset if required, which is a separate story.

icat-query-builder.service

This was covered in detail in the previous document - see there for details of the queries that it constructs. Its .run(), .count(), .min() and .max() functions all use icat.query() to run the constructed query.

This service is used by tc-icat.service's queryBuilder() function, which is used by numerous other components.

References to icat.query() and icat.queryBuilder()

Next I looked for places where the query() method above is used, as well as queryBuilder(). There is quite a bit of overlap here with the original document, but there the focus was on .where() constructions (and some examples of ad-hoc query construction); here I will try to fill in the remaining details.

These are organised here by the modules (mainly controllers and services) in which they appear.

breadcrumb.controller

In a promise used by the controller constructor:

    // specific proposal case:
    facility.icat().query(["select investigation from Investigation investigation where investigation.name = ?", entityId, "limit 0, 1"], 
        timeout).then(function(entities){...

    // general case:
    facility.icat().query(["select ? from ? ? where ?.id = ?", entityType.safe(), helpers.capitalize(entityType).safe(), 
                          entityType.safe(), entityType.safe(), entityId, "limit 0, 1"], timeout)
                   .then(function(entities){ ...

This is one of many places where Topcat has dedicated code to handle the "proposal" (pseudo-)entity required by DLS; here it substitutes "investigation" for "proposal" (in other places the treatment is more complex). The upshot here is that Topcat queries ICAT for a specific entity by its id. I'm not exactly sure when the query gets sent, because the calls to .query() above are added to a list of promises, within a function that also returns a promise; and the function itself is then called by some twiddly Javascript in the controller function:

    timeout.promise.then(function(){ $timeout.cancel(breadcrumbPromise); });

Example queries

Observed when drilling down to datasets in LILS:

select investigation 
from Investigation investigation 
where investigation.id = "4" 
limit 0, 1

select investigation 
from Investigation investigation 
where investigation.name = "Proposal 1" 
limit 0, 1

browse-entities.controller

The function generateQueryBuilder(entityType) uses the grid options from the configuration to construct queries to populate the rows. The where clause constructions were covered in the previous document, but not the inclusions or ordering.

    var out = icat.queryBuilder(entityType);

            _.each(gridOptions.columnDefs, function(columnDef){
                if(!columnDef.field) return;

The .where constructions are added here, depending on columnDef.filters. As described before, the full set of clauses that may be added is:

      out.where(["facility.id = ?", facilityId]);
      out.where(["investigation.name = ?", id]);
      out.where(["?.id = ?", variableName.safe(), parseInt(id || "-1")]);
      out.where("investigationInstrument.id = instrument.id")
      out.where([
                            "? between {ts ?} and {ts ?}",
                            columnDef.jpqlFilter.safe(),
                            from.safe(),
                            to.safe()
                        ]);
      out.where([
                            "? between ? and ?",
                            columnDef.jpqlFilter.safe(),
                            from,
                            to
                        ]);
      out.where("datafileParameterType.name = 'run_number'")
      out.where([
                        "UPPER(?) like concat('%', ?, '%')", 
                        columnDef.jpqlFilter.safe(),
                        columnDef.filter.term.toUpperCase()
                    ]);

If not configured explicitly in topcat.json, columnDef.jpqlFilter defaults to 'entityType.field' (e.g. 'dataset.name'). Note that string.safe() doesn't process the string value, but marks it as 'safe', so subsequent processing will not add quotes or escape characters.
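
To make the effect of .safe() concrete, here is a minimal sketch of placeholder substitution. It is not helpers.buildQuery(): safe, renderValue and buildQuerySketch are hypothetical stand-ins, the quoting rules are inferred only from the observed queries (strings in double quotes, numbers and "safe" values inserted as-is), and escaping of embedded quotes is deliberately left out.

```javascript
// Hypothetical stand-in for string.safe(): wrap the value and mark it.
function safe(value) {
  return { isSafe: true, value: String(value) };
}

// Assumed rendering rules, inferred from the observed queries.
function renderValue(value) {
  if (value && value.isSafe) return value.value;        // inserted verbatim
  if (typeof value === "number") return String(value);  // no quotes
  return '"' + String(value) + '"';                     // quoted string
}

// Each string fragment consumes one following list element per '?'.
function buildQuerySketch(parts) {
  const queue = parts.slice();
  const fragments = [];
  while (queue.length > 0) {
    const pieces = String(queue.shift()).split("?");
    let fragment = pieces[0];
    for (let i = 1; i < pieces.length; i++) {
      fragment += renderValue(queue.shift()) + pieces[i];
    }
    fragments.push(fragment);
  }
  return fragments.join(" ");
}

const idClause = buildQuerySketch(["?.id = ?", safe("dataset"), parseInt("10", 10)]);
const nameClause = buildQuerySketch(["investigation.name = ?", "Proposal 1"]);
```

Here idClause comes out as dataset.id = 10 and nameClause as investigation.name = "Proposal 1", which matches the shape of the clauses in the observed queries.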

If the column's field includes an expression of the form entityType.member, then entityType is added to the query's include list (remember, subsequent processing may add further includes based on the where clauses):

    if(columnDef.field.match(/\./)){
        var entityType =  columnDef.field.replace(/\[([^\.=>\[\]\s]+)/, function(match){ 
            return helpers.capitalize(match.replace(/^\[/, ''));
        }).replace(/^([^\.\[]+).*$/, '$1');
        out.include(entityType);
    }
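
The two regular expressions above are easier to follow with concrete inputs. The sketch below wraps the same replacements in a hypothetical function, includeTargetFor, with capitalize assumed to behave like helpers.capitalize (upper-casing the first character). The bracketed field name is an invented input chosen to exercise the first regular expression, not one taken from a real topcat.json.

```javascript
// Assumed equivalent of helpers.capitalize().
function capitalize(text) {
  return text.charAt(0).toUpperCase() + text.slice(1);
}

// Returns the variable name that would be passed to out.include(),
// or null when the field contains no '.'.
function includeTargetFor(field) {
  if (!field.match(/\./)) return null;
  return field
    // Fold the first "[name" into the preceding text as "Name".
    .replace(/\[([^\.=>\[\]\s]+)/, function (match) {
      return capitalize(match.replace(/^\[/, ""));
    })
    // Keep everything up to the first '.' or '['.
    .replace(/^([^\.\[]+).*$/, "$1");
}

const plainTarget = includeTargetFor("dataset.name");
const bracketTarget = includeTargetFor("datafileParameter[type.name].numericValue");
const noTarget = includeTargetFor("name");
```

With these inputs plainTarget is "dataset", bracketTarget is "datafileParameterType" and noTarget is null.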

Order-by clauses are added based on the column's sort configuration:

_.each(sortColumns, function(sortColumn){
    if(sortColumn.colDef){
        out.orderBy(sortColumn.colDef.jpqlSort, sortColumn.sort.direction);
    }
});

Finally, any defined externalGridFilters are given the opportunity to modify the query:

            _.each(externalGridFilters, function(externalGridFilter){
                externalGridFilter.modifyQuery.apply(that, [out]);
            });

This last part is somewhat open-ended. The externalGridFilters are defined in tc-ui.service (see below for slightly more detail). I have yet to investigate further, but I believe that the idea behind externalGridFilters is to allow plugins to add features to the grid views.

In turn, generateQueryBuilder() (abbreviated to gQB() below) is used in several functions in browse-entities.controller:

getPage() returns gQB().limit(...).run().then(fn(entities){...})
updateTotalItems() returns gQB().count(...).then(fn(totalItems){...})
selectAll(): if isAncestorInCart() then does gQB().run().then(fn(entities){...})
unselectAll(): same as selectAll()

.limit() adds a limit clause to the query; .count() modifies the query to return a count, then runs it; .run() runs the query as-is.

getPage() and updateTotalItems() are called from many places:

Whenever the page is refreshed in the browser:

        $scope.$on('global:refresh', function(){
            page = 1;
            getPage().then(function(results){
                gridOptions.data = results;
                updateTotalItems();
                updateSelections();
                updateScroll(results.length);
            });
        });

In the API registration function (I don't know when this is called); note that as well as performing an initial getPage() it also defines event actions (when sorting or filter options are changed) that call getPage():

        gridOptions.onRegisterApi = function(_gridApi) {
            gridApi = _gridApi;
            restoreState();

            getPage().then(function(results){
                gridOptions.data = results;
                updateTotalItems();
                updateSelections();
                updateScroll(results.length);
            });

            gridApi.core.on.sortChanged($scope, function(grid, _sortColumns){
                sortColumns = _sortColumns;
                page = 1;
                getPage().then(function(results){
                    updateScroll(results.length);
                    gridOptions.data = results;
                    updateSelections();
                    saveState();
                });
            });

            gridApi.core.on.filterChanged($scope, function(){
                canceler.resolve();
                canceler = $q.defer();
                page = 1;
                gridOptions.data = [];
                getPage().then(function(results){
                    gridOptions.data = results;
                    updateSelections();
                    updateScroll(results.length);
                    updateTotalItems();
                    saveState();
                });
            });

Later in the same method, callbacks to handle scrolling or pagination are defined that also use getPage():

            if(isScroll){
                //scroll down more data callback (append data)
                gridApi.infiniteScroll.on.needLoadMoreData($scope, function() {
                    page++;
                    getPage().then(function(results){
                        _.each(results, function(result){ gridOptions.data.push(result); });
                        if(results.length == 0) page--;
                        updateSelections();
                        updateScroll(results.length);
                    });
                });

                //scoll up more data at top callback (prepend data)
                gridApi.infiniteScroll.on.needLoadMoreDataTop($scope, function() {
                    page--;
                    getPage().then(function(results){
                        _.each(results.reverse(), function(result){ gridOptions.data.unshift(result); });
                        if(results.length == 0) page++;
                        updateSelections();
                        updateScroll(results.length);
                    });
                });
            } else {
                //pagination callback
                gridApi.pagination.on.paginationChanged($scope, function(_page, _pageSize) {
                    page = _page;
                    pageSize = _pageSize;
                    getPage().then(function(results){
                        gridOptions.data = results;
                        updateSelections();
                    });
                });
            }

Example queries

Observed when drilling down to datasets in LILS: this query is generated by getPage():

select distinct dataset 
from Dataset dataset , 
     dataset.investigation as investigation , 
     investigation.facility as facility 
where facility.id = 1 
  and investigation.name = "Proposal 1" 
  and investigation.id = 4 
limit 0, 50

and this query by updateTotalItems():

select count(distinct dataset) 
from Dataset dataset , 
     dataset.investigation as investigation , 
     investigation.facility as facility 
where facility.id = 1 
  and investigation.name = "Proposal 1" 
  and investigation.id = 4

Changing a sort order on a column results in a new getPage() request:

select distinct dataset 
from Dataset dataset , 
     dataset.investigation as investigation , 
     investigation.facility as facility 
where facility.id = 1 
  and investigation.name = "Proposal 1" 
  and investigation.id = 4 
ORDER BY dataset.createTime asc 
limit 0, 50

An example of filtering (by location) on datafiles:

select distinct datafile 
from Datafile datafile , 
     datafile.dataset as dataset , 
     dataset.investigation as investigation , 
     investigation.facility as facility 
where facility.id = 1 
  and investigation.name = "Proposal 1" 
  and investigation.id = 4 
  and dataset.id = 10 
  and UPPER(datafile.location) like concat("%", "VERO", "%") 
limit 0, 50

browse-facilities.controller

Here, a relatively straightforward query is used to retrieve (the details of) each facility to which (I believe) the user is currently logged in:

        _.each(tc.userFacilities(), function(facility){
            facility.icat().query(["select facility from Facility facility where facility.id = ?", facility.config().id]).then(function(facilities){
                facilities[0].facilityName = facility.config().name;
                gridOptions.data.push(facilities[0]);
            });
        });

doi-redirect.controller

A simple query to retrieve an entity using the entityType and entityId supplied as parameters to the controller:

    tc.icat(facilityName).query(["select ? from ? ? where ?.id = ?", 
        entityType.safe(), helpers.capitalize(entityType).safe(), 
        entityType.safe(),entityType.safe(), entityId])
    .then(function(entities){ ...

In the normal case, the code then opens a browse view on the retrieved entity.

meta-panel.controller

This defines a rowclick event handler that constructs a query to retrieve the metatabs data for the row. As described in the previous document, its use of the queryBuilder is slightly different from elsewhere, in that it generates a hard-wired set of inclusions based on the entity type:

    var queryBuilder = 
        facility.icat().queryBuilder(entity.type)
                       .where(entity.type + ".id = " + entity.id);

    if(entity.type == 'instrument'){
        queryBuilder.include('instrumentScientist');
    }

    if(entity.type == 'investigation'){
        queryBuilder.include('user');
        queryBuilder.include('investigationParameterType');
        queryBuilder.include('sample');
        queryBuilder.include('publication');
        queryBuilder.include('study');
        queryBuilder.include('investigationUser');
        queryBuilder.include('studyInvestigation');
    }

    if(entity.type == 'dataset'){
        queryBuilder.include('datasetParameterType');
        queryBuilder.include('sample');
        queryBuilder.include('datasetType');
    }

    if(entity.type == 'datafile'){
        queryBuilder.include('datafileParameterType');
        queryBuilder.include('datafileFormat');
    }
            
    queryBuilder.run(timeout.promise).then(function(entity){

Note that the argument to .include() is a variable name; the queryBuilder code determines the "variablePath" to reach the corresponding entity from the current entity, and adds this to the final query, but only if the variable's entity is not already included by some other route. This means that the generated queries can look quite different to what seems to be implied above.

I now believe that the use of .include("sample") for Investigation and Dataset is incorrect, and it should be .include("investigationSample") and .include("datasetSample") respectively. See the example of calculated variablePaths for datasets; this suggests that the variableName in the dataset case should be datasetSample. See this Topcat issue for more details.

Example meta-panel queries

Meta-panel query for a "proposal" in LILS:

select distinct investigation
from Investigation investigation
where investigation.id = 5
include
    investigation.parameters.type,
    investigation.publications,
    investigation.studyInvestigations.study,
    investigation.investigationUsers.user

Meta-panel query for a dataset:

select distinct dataset from Dataset dataset
where dataset.id = 13
include dataset.parameters.type, dataset.type

I am not sure why "sample" is missing from these - but see my suspicion above.

For a datafile:

select distinct datafile from Datafile datafile
where datafile.id = 1201
include datafile.parameters.type, datafile.datafileFormat
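
The pruning described above, where an include is only added if its entity is not already reached by some other route, can be sketched as follows. This is not the queryBuilder's code: buildIncludes is a hypothetical function, and the assumedPaths table is reverse-engineered from the observed "proposal" query rather than taken from Topcat. Names with no entry in the table (such as "sample") are simply skipped, which is one possible explanation for its absence, not a confirmed one.

```javascript
// Assumed variable-name to path mapping, inferred from the observed query.
const assumedPaths = {
  user: ["investigationUsers", "user"],
  investigationUser: ["investigationUsers"],
  investigationParameterType: ["parameters", "type"],
  publication: ["publications"],
  study: ["studyInvestigations", "study"],
  studyInvestigation: ["studyInvestigations"]
};

function buildIncludes(root, variableNames, paths) {
  const candidates = [];
  variableNames.forEach(function (name) {
    if (paths[name]) candidates.push(root + "." + paths[name].join("."));
  });
  // Drop duplicates, and any path that a longer path already passes through.
  return candidates.filter(function (path, index) {
    if (candidates.indexOf(path) !== index) return false;
    return !candidates.some(function (other) {
      return other.indexOf(path + ".") === 0;
    });
  });
}

const includes = buildIncludes("investigation", [
  "user", "investigationParameterType", "sample", "publication",
  "study", "investigationUser", "studyInvestigation"
], assumedPaths);
```

With these assumptions includes contains the same four paths as the observed query (though not necessarily in the same order): investigation.investigationUsers and investigation.studyInvestigations are dropped because longer paths already pass through them.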

my-data.controller

The code here is very similar to browse-entities.controller; there is a generateQueryBuilder method that is largely the same. Here, the potential where clauses are:

      out.where(["facility.id = ?", facilityId]);
      out.where(["investigation.name = ?", id]);
      out.where(["?.id = ?", variableName.safe(), parseInt(id || "-1")]);
      out.where("investigationInstrument.id = instrument.id")
      out.where([
                            "? between {ts ?} and {ts ?}",
                            columnDef.jpqlFilter.safe(),
                            from.safe(),
                            to.safe()
                        ]);
      out.where([
                            "? between ? and ?",
                            columnDef.jpqlFilter.safe(),
                            from,
                            to
                        ]);
      out.where("datafileParameterType.name = 'run_number'")
      out.where([
                        "UPPER(?) like concat('%', ?, '%')", 
                        columnDef.jpqlFilter.safe(),
                        columnDef.filter.term.toUpperCase()
                    ]);

generateQueryBuilder is used in getPage(), with .limit() used to restrict results to the current page-worth:

    function getPage(){
        that.isLoading = true;
        return generateQueryBuilder().limit((page - 1) * pageSize, pageSize).run(canceler.promise).then(function(entities){
            that.isLoading = false;
            _.each(entities, function(entity){

At this point, note that entity.getDatasetCount() and getDatafileCount() will also generate queries (entity.getSize() is ultimately handled by the IDS, not by the ICAT entityManager):


    if(isSizeColumnDef && entity.getSize){
        entity.getSize(canceler.promise);
    }
    if(isDatafileCountColumnDef && entity.getDatafileCount){
        entity.getDatafileCount(canceler.promise);
    }
    if(isDatasetCountColumnDef && entity.getDatasetCount){
        entity.getDatasetCount(canceler.promise);
    }

One difference from browse-entities.controller is the construction of further queries that (in some cases) retrieve and add extra minBlah / maxBlah properties to an entity (for some "field suffix" Blah). Note the "todo" comment that says this is a hack for ISIS!

            _.each(gridOptions.columnDefs, function(columnDef){
                //todo: this is a hack for ISIS - refactor to make more generic
                if(columnDef.type == 'number' && columnDef.filters){
                    var pair = columnDef.jpqlFilter.split(/\./);
                    var entityType = pair[0];
                    var entityField = pair[1];
                    var fieldNameSuffix = helpers.capitalize(entityType) + helpers.capitalize(entityField);

                    icat.queryBuilder(entityType).where([
                        "investigation.id = ?", entity.id,
                        "and datafileParameterType.name = 'run_number'"
                    ]).min('numericValue', canceler.promise).then(function(min){
                        entity['min' + fieldNameSuffix] = min;
                    });

                    icat.queryBuilder('datafileParameter').where([
                        "investigation.id = ?", entity.id,
                        "and datafileParameterType.name = 'run_number'"
                    ]).max('numericValue', canceler.promise).then(function(max){
                        entity['max' + fieldNameSuffix] = max;
                    });
                }
            });

        });
        return entities;
    });
}

I suspect this section of code assumes that the my-data entityType is Investigation. The queries try to obtain the minimum and maximum values of run_number for the current investigation and then set corresponding properties on the entity.
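
As a small illustration of the property names this produces, the sketch below repeats the fieldNameSuffix calculation in isolation. minMaxPropertyNames is a hypothetical wrapper, capitalize is assumed to behave like helpers.capitalize, and the jpqlFilter value is an invented example rather than one read from a real configuration.

```javascript
// Assumed equivalent of helpers.capitalize().
function capitalize(text) {
  return text.charAt(0).toUpperCase() + text.slice(1);
}

// Mirrors the suffix calculation in the excerpt above.
function minMaxPropertyNames(jpqlFilter) {
  const pair = jpqlFilter.split(/\./);
  const suffix = capitalize(pair[0]) + capitalize(pair[1]);
  return { min: "min" + suffix, max: "max" + suffix };
}

const names = minMaxPropertyNames("datafileParameter.numericValue");
```

For this input the entity would gain properties named minDatafileParameterNumericValue and maxDatafileParameterNumericValue.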

getPage() is used similarly to that in browse-entities.controller, though here there is no refresh handler defined.
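
The paging arithmetic in getPage() can be checked against the "limit 0, 50" seen in the observed queries. The sketch below assumes that .limit(offset, count) is rendered as "limit offset, count"; limitClause is a hypothetical helper, not part of the queryBuilder.

```javascript
// Offset and count as passed to .limit() in getPage().
function limitClause(page, pageSize) {
  const offset = (page - 1) * pageSize;
  return "limit " + offset + ", " + pageSize;
}

const firstPage = limitClause(1, 50);
const thirdPage = limitClause(3, 50);
```

So the first page with a page size of 50 gives "limit 0, 50", and the third page would give "limit 100, 50".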

search-parameter.controller

The controller constructor generates a query to retrieve the list of parameterTypes from each facility:

_.each(tc.userFacilities(), function(facility){
    facility.icat().query([
        "select parameterType from ParameterType parameterType",
        "where",
        "parameterType.applicableToInvestigation = true or",
        "parameterType.applicableToDataset = true or",
        "parameterType.applicableToDatafile = true",
        "include parameterType.permissibleStringValues"
    ]).then(function(parameterTypes){

tc-icat-entity.service

The getDatasetCount() function (defined when the entity is an Investigation) uses a query:

    var query = "select count(dataset) from Dataset dataset, dataset.investigation as investigation where investigation.id = ?";
    ...
    return icat.query([query, that.id], options)...

getDatafileCount() (defined for investigations and datasets) is similar but slightly more involved; the query is different for the two entity types, and the result is cached:

    var query;
    if(this.entityType == 'investigation'){
        query = "select count(datafile) from Datafile datafile, datafile.dataset as dataset, dataset.investigation as investigation where investigation.id = ?";
    } else {
        query = "select count(datafile) from Datafile datafile, datafile.dataset as dataset where dataset.id = ?";
    }

    var key = 'getDatafileCount:' + this.entityType + ":" + this.id;
    return icat.cache().getPromise(key, function(){ return icat.query([query, that.id], options); }).then(function(response){

getDatasetCount() and getDatafileCount() are used in browse-entities.controller and my-data.controller's getPage(), covered above. getDatafileCount() is used by tc-user-cart.service, and in turn by cart.controller.

The thisAndAncestors() member function also generates queries. The construction is based on the configured hierarchy, in code that is somewhat convoluted (and, as ever, undocumented!):

this.thisAndAncestors = function(){
	var hierarchy = _.clone(facility.config().hierarchy);
				
	hierarchy.shift();
	var investigationPosition = _.indexOf(hierarchy, 'investigation');
	var proposalPosition = _.indexOf(hierarchy, 'proposal');

	if(investigationPosition > -1 && proposalPosition > -1){
		hierarchy.splice(proposalPosition, 1);
	} else if(proposalPosition > -1){
		hierarchy[proposalPosition] = 'investigation';
	}

	var path = this.findPath(hierarchy) || [];
	var out = [];

	while(path.length > 0){
		if(path.pop() == this.entityType) break;
	}

	return parent(this, {});

The queries are generated by a recursive function, parent():

    function parent(entity, ids){
        out.push(entity);
        if(path.length == 0) return $q.resolve(out);
        ids[entity.entityType] = entity.id;
        var parentEntityName = path.pop();
        var query = parentQueries[entity.entityType][parentEntityName](ids);
        return icat.query(query).then(function(entities){
            return parent(entities[0], ids);
        });
    }

where parentQueries is defined by:

var parentQueries = {
    datafile: {
        dataset: function(ids){
            return [
                'select dataset from Datafile datafile, datafile.dataset as dataset',
                'where datafile.id = ?', ids.datafile
            ];
        }
    },
    dataset: {
        investigation: function(ids){
            return [
                'select investigation from Dataset dataset, dataset.investigation as investigation',
                'where dataset.id = ?', ids.dataset
            ];
        }
    },
    investigation: {
        instrument: function(ids){
            return [
                'select instrument from',
                'Investigation investigation,',
                'investigation.investigationInstruments as investigationInstrument,',
                'investigationInstrument.instrument as instrument',
                'where investigation.id = ?', ids.investigation 
            ];
        },
        facilityCycle: function(ids){
            return [
                'select facilityCycle from Investigation investigation,',
                'investigation.facility as facility,',
                'facility.facilityCycles as facilityCycle',
                'where investigation.startDate BETWEEN facilityCycle.startDate AND facilityCycle.endDate',
                'and investigation.id = ?', ids.investigation
            ]
        }
    },
    facilityCycle: {
        instrument: function(ids){
            return [
                'select instrument from',
                'Investigation investigation,',
                'investigation.investigationInstruments as investigationInstrument,',
                'investigationInstrument.instrument as instrument',
                'where investigation.id = ?', ids.investigation
            ];
        }
    }
};

Calculating the full set of possible queries that can arise from this in practice is left as an exercise for the reader! I infer from the context (as much as from following the code) that thisAndAncestors() returns a list that contains the current entity and all its parent entities along the configured hierarchy, at least up to the ancestral Investigation (judging by parentQueries, the chain can continue to a facilityCycle or instrument). Note that parent() pushes the starting entity onto the list without querying for it, so the code generates one query per ancestor, but none for the entity itself.
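
For a given hierarchy, the sequence of parent lookups can be worked out without running any queries. The sketch below repeats the hierarchy manipulation from thisAndAncestors() and then lists the (child, parent) steps that parent() would walk. ancestorSteps is a hypothetical function, findPath() is assumed to return the adjusted hierarchy unchanged, and both example hierarchies are invented for illustration.

```javascript
function ancestorSteps(configuredHierarchy, entityType) {
  const hierarchy = configuredHierarchy.slice();
  hierarchy.shift(); // drop the first element, as thisAndAncestors() does

  const investigationPosition = hierarchy.indexOf("investigation");
  const proposalPosition = hierarchy.indexOf("proposal");
  if (investigationPosition > -1 && proposalPosition > -1) {
    hierarchy.splice(proposalPosition, 1);
  } else if (proposalPosition > -1) {
    hierarchy[proposalPosition] = "investigation";
  }

  // Assumption: findPath(hierarchy) returns the hierarchy as-is.
  const path = hierarchy.slice();
  while (path.length > 0) {
    if (path.pop() === entityType) break;
  }

  // Each remaining element is one parent lookup, nearest ancestor first.
  const steps = [];
  let child = entityType;
  while (path.length > 0) {
    const parentName = path.pop();
    steps.push({ from: child, to: parentName });
    child = parentName;
  }
  return steps;
}

const datafileSteps = ancestorSteps(
  ["facility", "proposal", "investigation", "dataset", "datafile"], "datafile");
const datasetSteps = ancestorSteps(
  ["facility", "proposal", "dataset", "datafile"], "dataset");
```

Under these assumptions datafileSteps has two entries (datafile to dataset, then dataset to investigation), corresponding to parentQueries.datafile.dataset and parentQueries.dataset.investigation, and datasetSteps has a single entry (dataset to investigation).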

thisAndAncestors() is used when setting up this.stateParams, seemingly to extract the entityId and perhaps the proposalId for the entity and each of its ancestors:

this.stateParams = function(){
	if($state.current.name.match(/^home\.browse\.facility\./)){
		var out = _.clone($state.params);
		delete out.uiGridState;
		out[helpers.uncapitalize(this.entityType) + "Id"] = this.id;
		return $q.resolve(out);
	} else {
		return this.thisAndAncestors().then(function(thisAndAncestors){
			var out = {};
			_.each(thisAndAncestors, function(entity){
				out[entity.entityType + "Id"] = entity.id;
				if(entity.entityType == 'investigation') out['proposalId'] = entity.name;
			});
			return _.merge(out, {facilityName: facilityName});
		});
	}
};

this.stateParams() is called in turn by this.browse(), which is used in drill-down in browse-entities, browse-facilities, my-data and search-results, and in the doi-redirect controller.
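
The shape of the object built in the second branch of stateParams() can be seen from a small stand-alone sketch. stateParamsFrom is a hypothetical function, Object.assign stands in for _.merge (equivalent for flat objects like these), and the entity list and facility name are illustrative values based on the example queries earlier in this document.

```javascript
function stateParamsFrom(thisAndAncestors, facilityName) {
  const out = {};
  thisAndAncestors.forEach(function (entity) {
    out[entity.entityType + "Id"] = entity.id;
    // The investigation's name doubles as the proposal id.
    if (entity.entityType === "investigation") out.proposalId = entity.name;
  });
  return Object.assign(out, { facilityName: facilityName });
}

const params = stateParamsFrom([
  { entityType: "dataset", id: 10 },
  { entityType: "investigation", id: 4, name: "Proposal 1" }
], "LILS");
```

Here params ends up with datasetId 10, investigationId 4, proposalId "Proposal 1" and facilityName "LILS".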

tc-icat.service

As well as using the icat-query-builder service (to define the queryBuilder function that is used in browse-entities etc.), this also uses .query in several places during the .login() method:

this.login():
    ...
    promises.push(that.query([
        "SELECT facility FROM Facility facility WHERE facility.name = ?", name
        // sets $sessionStorage.sessions[facilityName].facilityId
    ...
    if(idsUploadDatasetType){
        promises.push(that.query([
            "SELECT datasetType FROM DatasetType datasetType, datasetType.facility as facility", 
            "WHERE facility.name = ?", name,
            "AND datasetType.name = ?", idsUploadDatasetType
        // sets sessions[facilityName].idsUploadDatasetTypeId
    ...
    if(idsUploadDatafileFormat){
        promises.push(that.query([
            "SELECT datasetType FROM DatafileFormat datasetType, datasetType.facility as facility", 
            "WHERE facility.name = ?", name,
            "AND datasetType.name = ?", idsUploadDatafileFormat
        // sets sessions[facilityName].idsUploadDatafileFormatId
    ...
    promises.push(that.query(["select user from User user where user.name = ?", username]).then(function(users){
        // sets sessions[facilityName].fullName

tc-user-cart.service

The cart constructor adds an entity() function to each CartItem, which generates a query to retrieve the entity from ICAT:

    cartItem.entity = helpers.overload({
      ...
      return facility.icat().query([
                    "select ? from ? ? where ?.id = ?",
                    this.entityType.safe(),
                    helpers.capitalize(this.entityType).safe(),
                    this.entityType.safe(),
                    this.entityType.safe(),
                    this.entityId
                ], options).then(function(entities){
                    return entities[0];
                });
            }, ...

cartItem.entity() is called from cartItem.getSize() and cartItem.getDatafileCount(); so both methods can result in queries being sent to the entityManager. cartItem.getSize() also uses entity.getSize(), which is passed to the IDS, not ICAT; but cartItem.getDatafileCount() sends a count() query to ICAT.

tc.service

The search() function performs an initial lucene query to retrieve a list of entity IDs. (Note that the search is cached using the lucene query as the key.)

this.search = helpers.overload({
	    'array, object, object': function(facilityNames, query, options){
    var defered = $q.defer();
    var promises = [];
    var results = [];
    query.target = query.target.replace(/^./, function(c){ return c.toUpperCase(); });
    var entityType = query.target;
    var entityInstanceName = helpers.uncapitalize(entityType);
    _.each(facilityNames, function(facilityName){
        var facility = tc.facility(facilityName);
        var icat = facility.icat();
        var key = "search:" + JSON.stringify(query);

        promises.push(icat.cache().getPromise(key, 10 * 60 * 60, function(){
            return icat.get('lucene/data', {
                sessionId: icat.session().sessionId,
                query: JSON.stringify(query),
                maxCount: 300
        }, options);
    }).then(function(data){
        if(data && data.length > 0){
            var ids = [];
            var scores = {};
            _.each(data, function(result){
                ids.push(result.id);
                scores[result.id] = result.score;
            });

Then it constructs queries (possibly multiple, due to chunking on the list of ids), each as a list that contains both strings and functions (perhaps the only place in Topcat where this happens). Each function returns a different query fragment depending on the entityType; the functions are not evaluated until very late in the query-building process.

    var promises = [];
    _.each(_.chunk(ids, 100), function(ids){
        var query = [
            function(){
                if(entityType == 'Investigation'){
                    return 'select investigation from Investigation investigation';
                } else if(entityType == 'Dataset') {
                    return 'select dataset from Dataset dataset';
                } else {
                    return 'select datafile from Datafile datafile';
                }
            },
            'where ?.id in (?)', entityInstanceName.safe(), ids.join(', ').safe(),
            function(){
                if(entityType == 'Investigation'){
                    return 'include investigation.investigationInstruments.instrument';
                } else if(entityType == 'Dataset'){
                    return 'include dataset.investigation';
                } else if(entityType == 'Datafile') {
                    return 'include datafile.dataset.investigation';
                }
            }
        ];
        promises.push(icat.query(query, options).then(function(_results){
            ...

I am not completely convinced that there is any real point to the use of deferred functions in the query list here; surely the evaluation could have been done here, so that the query list would just contain strings?
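
As a rough illustration of that alternative, here is a minimal sketch in which the entity-specific fragments are chosen immediately, so that each chunk of ids yields a plain query string. The names FRAGMENTS, chunk() and buildSearchQueries() are invented for this sketch and do not exist in Topcat. It also bypasses helpers.buildQuery() and .safe(), and (unlike the original) rejects unknown entity types rather than falling back to Datafile, so it is not a drop-in replacement.

```javascript
// Hypothetical lookup table: one select/include pair per searchable entity type.
var FRAGMENTS = {
    Investigation: {
        select: 'select investigation from Investigation investigation',
        include: 'include investigation.investigationInstruments.instrument'
    },
    Dataset: {
        select: 'select dataset from Dataset dataset',
        include: 'include dataset.investigation'
    },
    Datafile: {
        select: 'select datafile from Datafile datafile',
        include: 'include datafile.dataset.investigation'
    }
};

// Plain-JS stand-in for lodash's _.chunk(list, size).
function chunk(list, size) {
    var chunks = [];
    for (var i = 0; i < list.length; i += size) {
        chunks.push(list.slice(i, i + size));
    }
    return chunks;
}

// Returns one complete query string per chunk of ids; no deferred functions.
function buildSearchQueries(entityType, ids, chunkSize) {
    var fragments = FRAGMENTS[entityType];
    if (!fragments) {
        throw new Error('Unsupported entity type: ' + entityType);
    }
    var instanceName = entityType.charAt(0).toLowerCase() + entityType.slice(1);
    return chunk(ids, chunkSize).map(function (idChunk) {
        return [
            fragments.select,
            'where ' + instanceName + '.id in (' + idChunk.join(', ') + ')',
            fragments.include
        ].join(' ');
    });
}

var queries = buildSearchQueries('Investigation', [5, 16, 22, 29, 37, 65], 100);
console.log(queries[0]);
```

With the six ids above and a chunk size of 100 this produces a single query string; a list of more than 100 ids would produce one string per chunk.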

tc.search() is used in search-results.controller.

The query parameter passed to tc.search() is a structured object containing elements expected by ICAT's /lucene/data API, formed from the user inputs to the search dialog boxes. It is not SQL (or JPQL), and I will not discuss it further here.

search-results.controller

This uses tc.service's search() function, in the gridOptions API registration:

    gridOptions.onRegisterApi = function(_gridApi) {
        ...
        var query = _.merge(queryCommon, {target: type});
        var searchPromise = tc.search(facilities, timeout.promise, query);
        promises.push(searchPromise);

promises is a global variable; during construction there is a $q.all(promises). Though createGridOptions() (within which the onRegisterApi function is defined) is called first, I don't know whether gridOptions.onRegisterApi() will have been called yet; if so, then an initial call to tc.search() will be made.

Later in the registration code, the searchPromise is used in a getResults() function. Note that getResults() may also call entity.getSize(), entity.getDatafileCount() or entity.getDatasetCount(), each of which may generate further requests:

    function getResults(){
        function processResults(results){
            var out = _.select(results, filter);
            out.sort(sorter);
            _.each(out, function(entity){

                if(isSizeColumnDef && entity.getSize){
                    entity.getSize(timeout.promise);
                } 
                if(isDatafileCountColumnDef && entity.getDatafileCount) {
                    entity.getDatafileCount(timeout.promise);
                }
                if(isDatasetCountColumnDef && entity.getDatasetCount) {
                    entity.getDatasetCount(timeout.promise);
                }
            });
            return out;
        }
        return searchPromise.then(processResults, function(){}, processResults);
    }
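
In AngularJS's $q, the third argument to then() is the notify (progress) callback, so passing processResults in both the first and third positions means it can run more than once for a single search, each time potentially triggering the per-entity calls above. Whether tc.search() actually notifies with partial results is not shown in the excerpts here; the sketch below simply assumes that it does. makeDeferred() is an invented, synchronous stand-in for a $q deferred (the real $q delivers callbacks asynchronously), and the result values are placeholders.

```javascript
// Simplified stand-in for a $q deferred: records callbacks and invokes them
// synchronously. This is an assumption for illustration, not AngularJS code.
function makeDeferred() {
    var onSuccess = [];
    var onNotify = [];
    return {
        promise: {
            then: function (success, error, notify) {
                if (success) { onSuccess.push(success); }
                if (notify) { onNotify.push(notify); }
            }
        },
        notify: function (value) {
            onNotify.forEach(function (callback) { callback(value); });
        },
        resolve: function (value) {
            onSuccess.forEach(function (callback) { callback(value); });
        }
    };
}

// Records how many results it was given on each invocation.
var calls = [];
function processResults(results) {
    calls.push(results.length);
    return results;
}

var deferred = makeDeferred();
// Same shape as getResults(): processResults is both success and notify handler.
deferred.promise.then(processResults, function () {}, processResults);

deferred.notify(['a']);            // partial results after a first chunk
deferred.notify(['a', 'b']);       // partial results after a second chunk
deferred.resolve(['a', 'b', 'c']); // final results

console.log(calls.join(','));
```

Under that assumption, processResults runs three times for one search, so any getSize() or count calls it makes would be repeated unless they are cached.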

Example Search query requests

Search in LILS (investigations, datasets, datafiles) with text 'omnis'; this generated three Lucene requests:

http://localhost:8080/icat/lucene/data?
sessionId=...
query={"text":"omnis","target":"Investigation"}
maxCount=300

http://localhost:8080/icat/lucene/data?
sessionId=...
query={"text":"omnis","target":"Dataset"}
maxCount=300

http://localhost:8080/icat/lucene/data?
sessionId=...
query={"text":"omnis","target":"Datafile"}
maxCount=300

followed by several entityManager queries:

select investigation from Investigation investigation 
where investigation.id in (5, 16, 22, 29, 37, 65) 
include investigation.investigationInstruments.instrument

select datafile from Datafile datafile 
where datafile.id in (796, 5250, 9342, ..., 1625) 
include datafile.dataset.investigation

select datafile from Datafile datafile 
where datafile.id in (1745, 1760, 1837, ..., 4710) 
include datafile.dataset.investigation

select datafile from Datafile datafile 
where datafile.id in (4711, 4715, 4720, ..., 7565) 
include datafile.dataset.investigation

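In this case, it appears that there were no matching Datasets, and enough matching Datafiles to cause Topcat to split the request into chunks: three Datafile queries are consistent with between 201 and 300 matching ids, given the chunk size of 100 and maxCount=300.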

ExternalGridFilters

Recall that at the end of generateQueryBuilder() in browse-entities.controller and my-data.controller, there is code that calls externalGridFilter.modifyQuery() on the query constructed so far. The only other reference to externalGridFilters in the code is in tc-ui.service:

    this.registerExternalGridFilter = helpers.overload({
        'array, object': function(states, options){
            _.each(states, function(state){
                externalGridFilters[state] = externalGridFilters[state] || [];
                externalGridFilters[state].push({
                    template: options.template || '',
                    setup: options.setup || function(){},
                    modifyQuery: options.modifyQuery || function(){}
                });
            });
        }
    });

Topcat itself does not use registerExternalGridFilter(). I can see that the IJP plugin uses it to add extra .where() clauses to the query, e.g. to restrict results to the latest version of each dataset.
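
To make the mechanism concrete, here is a minimal sketch of a plugin registering a filter and of that filter later being applied to a query. The registration function mirrors the tc-ui.service code above but drops helpers.overload. Everything else is an assumption for illustration: the state name 'example.state', the where clause, the makeQueryBuilderStub() helper, and the idea that modifyQuery() receives the query builder as its argument (the excerpts here do not show how it is invoked).

```javascript
// Stand-in for the registry held by tc-ui.service (simplified: no helpers.overload).
var externalGridFilters = {};

function registerExternalGridFilter(states, options) {
    states.forEach(function (state) {
        externalGridFilters[state] = externalGridFilters[state] || [];
        externalGridFilters[state].push({
            template: options.template || '',
            setup: options.setup || function () {},
            modifyQuery: options.modifyQuery || function () {}
        });
    });
}

// Hypothetical query builder stub that only records the where() clauses it is given.
function makeQueryBuilderStub() {
    var clauses = [];
    return {
        clauses: clauses,
        where: function (clause) {
            clauses.push(clause);
            return this;
        }
    };
}

// Hypothetical plugin registration; the state name and clause are made up.
registerExternalGridFilter(['example.state'], {
    modifyQuery: function (queryBuilder) {
        queryBuilder.where("dataset.type.name = 'example'");
    }
});

// Roughly what the end of generateQueryBuilder() might do for the current state
// (assumed calling convention: the builder is passed as the only argument).
var builder = makeQueryBuilderStub();
(externalGridFilters['example.state'] || []).forEach(function (filter) {
    filter.modifyQuery(builder);
});

console.log(builder.clauses);
```

A filter registered without a modifyQuery option gets the default no-op function, so it leaves the query unchanged.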

I observe that the IJP plugin also uses tc.icat(facilityName).queryBuilder()! So plugins can create entirely new queries as well as modify existing ones. This could complicate matters quite a bit, but it may not be worth worrying about.
