Watson Marketing Ideas


Better control over canister session indexing is needed, to reduce processing and storage cost

Tealeaf session indexing can be very costly, with index storage size commonly being twice session (LSSN) storage size, and sometimes escalating to far more. More control is required to skip indexing on content that is not meaningful for session search.

Tealeaf currently provides a mechanism for adding content types to index (i.e., "Additional Content Types to Index"), but the opposite type of control is also needed.

These options are requested, for identifying hits to skip:

  • "Content Types to Skip"
  • "File Extensions to Skip"
  • "URLs to Skip" (where the URL contains any of the specified strings)
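
As a rough illustration of how the three requested options might combine, here is a minimal Python sketch. It is not Tealeaf code or configuration: the `SkipRules` class, the `should_skip_indexing` function, and all field names are hypothetical, and it assumes a hit is skipped as soon as any one rule matches.

```python
from dataclasses import dataclass, field
from urllib.parse import urlsplit
import posixpath


@dataclass
class SkipRules:
    """Hypothetical container for the three requested settings."""
    content_types: set = field(default_factory=set)     # "Content Types to Skip"
    file_extensions: set = field(default_factory=set)   # "File Extensions to Skip"
    url_substrings: list = field(default_factory=list)  # "URLs to Skip"


def should_skip_indexing(url: str, content_type: str, rules: SkipRules) -> bool:
    """Return True if any rule marks this hit as skipped for indexing."""
    # Drop parameters such as "; charset=utf-8" and normalize case.
    media_type = content_type.split(";", 1)[0].strip().lower()
    if media_type in rules.content_types:
        return True

    # Not every URL has a file extension; splitext yields "" in that case.
    path = urlsplit(url).path
    extension = posixpath.splitext(path)[1].lstrip(".").lower()
    if extension and extension in rules.file_extensions:
        return True

    # Skip when the URL contains any of the configured strings.
    return any(fragment in url for fragment in rules.url_substrings)
```

Under these assumptions, a rule set such as `SkipRules(content_types={"application/json"}, url_substrings=["/api/heartbeat"])` would skip JSON responses and any URL containing `/api/heartbeat`, while ordinary `text/html` pages would still be indexed.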

A workaround has been deployed from time to time: "munging" the content type so that the indexer skips the hit. However, this sometimes interferes with replay.

For reference, these content types are indexed by default:

    text/html
    text/plain
    text/xml
    application/xhtml+xml
    application/rdf+xml
    application/vnd.mozilla.xul+xml
    application/xml

A hit designated to be "skipped" for indexing would disregard these defaults, and not be indexed.

When a hit is "skipped" for indexing, standard request indexing would still be done, but the response and any JSON or XML payload in the request would not be indexed. This would help materially in AJAX-heavy applications, where those data elements are not expected to be searched for directly.
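
The proposed behavior can be sketched in Python as follows. This is only an illustration of the request, not how the canister indexer is implemented: the function `parts_to_index` and its parameters are hypothetical, and it assumes that a request payload counts as JSON or XML when its media type ends in `json` or `xml`.

```python
# Content types listed above as indexed by default.
DEFAULT_INDEXED_TYPES = frozenset({
    "text/html",
    "text/plain",
    "text/xml",
    "application/xhtml+xml",
    "application/rdf+xml",
    "application/vnd.mozilla.xul+xml",
    "application/xml",
})


def _media_type(header_value):
    """Strip parameters (e.g. charset) and normalize case; None becomes ""."""
    return (header_value or "").split(";", 1)[0].strip().lower()


def parts_to_index(response_type, request_body_type=None, skipped=False,
                   additional_types=frozenset()):
    """Return the parts of a hit that would be indexed under the proposal."""
    parts = ["request"]  # standard request indexing is always done
    if skipped:
        # A skipped hit disregards the defaults: no response, no payload.
        return parts
    if _media_type(request_body_type).endswith(("json", "xml")):
        parts.append("request_payload")
    if _media_type(response_type) in DEFAULT_INDEXED_TYPES | set(additional_types):
        parts.append("response")
    return parts
```

With this sketch, a skipped hit yields only `["request"]` regardless of its content type, which is where the indexing and storage savings would come from.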

Additionally, if the DTSearch engine supports compression and it is not already in use, enabling it might reduce storage size. However, uncompressed storage has an advantage when the storage media supports de-duplication, so compression should remain optional.

  • Guest
  • Jul 30 2018
  • Planned for Future Release
How will this idea be used?

Reducing the storage cost for Tealeaf improves its value proposition, and helps with scalability. Additionally, reducing the indexing workload avoids indexing backlogs, and makes completed sessions searchable earlier.

What is your industry? Telecommunications
What is the idea priority? Medium
  • Guest commented
    August 3, 2018 22:05

    Recommend linking this with the Flash Storage encryption toggle Idea/Request. If we can optionally turn off software encryption, we can use newer hardware-based flash storage systems that do all three things: hardware-supported encryption, compression (15% gain) and data de-duplication (up to 7x gain).
    https://watsonmarketing.ideas.aha.io/ideas/TLONPREM-I-12

  • Admin
    ROB HAIN commented
    October 16, 2018 14:11

    Questions:

    • Can "Content Types" and "File Extensions" be rolled together into "Content Types"? Is there a compelling reason to keep these as discrete options?

    • Do these settings need to be domain or sub-domain specific?  Are there times when it is not desirable to skip content types / URLs across all domains?

    Best regards,

    Rob Hain
    IBM

  • Guest commented
    October 16, 2018 21:08

    Regarding content type vs. also having file extension control, that's likely not a very important feature. Not all URLs have a file extension, but when they do, it could be helpful where multiple file extensions share the same content type. The most helpful setting would be a "Content Types to Skip", as it is a natural match for the existing "Additional Content Types to Index" setting.

    Regarding having domain or sub-domain specific control, I don't think the extra complexity is likely to be needed.

    Where important items need indexing but are skipped (or are not even indexed today), there are other ways to make them findable. For something like an error code or message, copying those into [appdata] via a ReqSet and providing a custom search field is a good approach... and much better than finding them via a generic "all text" search. The overall goal is to reduce indexing processing and storage cost, which is a material concern at some sites.