Introduction
Sensefuel Max automatically integrates editorial content into the search results to inspire and guide customers.
What is editorial content?
Functionally, it is a page on your website or blog that contains at least:
- A title
- An image
- A description of the topic
Several types of editorial content are worth integrating into the search results. Here are some examples:
- recipes
- buying guides
- tutorials
- product comparisons
- news
- etc.
A few examples from our clients:
- recipe: https://www.coffee-spirit.maxicoffee.com/recette/cafe-viennois-la-recette-facile-et-delicieuse/
- tutorial: https://woolschool.happywool.com/tricot-la-maille-endroit
- buying guide: https://www.provost.fr/fr/guides/5-choisir-son-rack-palettes
- product comparison: https://www.coffee-spirit.maxicoffee.com/guide/les-meilleures-machines-a-cafe-capsules/
Editorial content retrieval methods
Sensefuel can retrieve your editorial content using several technical methods.
There are 2 main steps in this process:
- Retrieve the list of URLs of all editorial content to crawl
- Crawl each piece of editorial content to retrieve the required data
Retrieve the list of URLs of all editorial content to crawl
There are 2 main methods to obtain that list of URLs:
- Using an XML sitemap of your content
This is the recommended method because its development cost is lower and it is reliable and sustainable.
- Developing a crawler on one or several HTML content list pages (based on the HTML DOM)
Because the HTML of pages is more subject to change, the crawler could become obsolete and require new development.
In both cases, this needs to be discussed with your Customer Success Manager to determine the methodology and any additional rules to implement.
XML content sitemap:
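As a minimal sketch, assuming a standard sitemap in the sitemaps.org 0.9 format, a content sitemap and one way of reading its URLs could look like the following. The example.com URLs and the function name `extract_sitemap_urls` are hypothetical and do not describe Sensefuel's actual implementation.

```python
import xml.etree.ElementTree as ET

# Hypothetical content sitemap following the sitemaps.org 0.9 format.
SITEMAP_XML = """<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://www.example.com/recipes/viennese-coffee</loc>
    <lastmod>2024-01-15</lastmod>
  </url>
  <url>
    <loc>https://www.example.com/guides/choosing-a-pallet-rack</loc>
  </url>
</urlset>
"""

# Sitemap elements live in this XML namespace, so lookups must be qualified.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def extract_sitemap_urls(xml_text: str) -> list[str]:
    """Return every <loc> URL listed in a sitemap document."""
    root = ET.fromstring(xml_text)
    return [
        loc.text.strip()
        for loc in root.findall("sm:url/sm:loc", SITEMAP_NS)
        if loc.text
    ]


urls = extract_sitemap_urls(SITEMAP_XML)
```

Each `<loc>` entry becomes one URL to crawl in the second step of the process.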
HTML example:
As described above, Sensefuel can obtain the list of URLs of your editorial content from a list page, based on its HTML elements.
For example, Sensefuel can analyze the page and retrieve the list of your editorial content URLs based on the HTML tag <a> found within the tag <article class="PostCard">.
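The extraction described above can be sketched as follows. Only the `<article class="PostCard">` and `<a>` structure comes from the text; the sample page, the example.com URLs, and the names `PostCardLinkParser` and `extract_content_urls` are assumptions made for illustration, not Sensefuel's actual crawler.

```python
from html.parser import HTMLParser

# Hypothetical content list page.
LIST_PAGE_HTML = """
<main>
  <a href="/about">About us</a>
  <article class="PostCard">
    <a href="https://www.example.com/blog/knit-stitch-tutorial">The knit stitch</a>
  </article>
  <article class="PostCard featured">
    <a href="https://www.example.com/blog/best-capsule-machines">Best capsule machines</a>
  </article>
</main>
"""


class PostCardLinkParser(HTMLParser):
    """Collect href values of <a> tags nested inside <article class="PostCard">."""

    def __init__(self) -> None:
        super().__init__()
        self.urls = []
        self._article_stack = []  # one flag per open <article>: is it a PostCard?

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "article":
            classes = (attributes.get("class") or "").split()
            self._article_stack.append("PostCard" in classes)
        elif tag == "a" and any(self._article_stack):
            href = attributes.get("href")
            if href:
                self.urls.append(href)

    def handle_endtag(self, tag):
        if tag == "article" and self._article_stack:
            self._article_stack.pop()


def extract_content_urls(html_text: str) -> list[str]:
    parser = PostCardLinkParser()
    parser.feed(html_text)
    parser.close()
    return parser.urls


content_urls = extract_content_urls(LIST_PAGE_HTML)
```

The "About us" link is ignored because it sits outside any `PostCard` article. A rule like this depends on the page markup, which is why it has to be revisited whenever the HTML changes.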
Editorial content crawler
For each piece of editorial content, Sensefuel can crawl several types of data, including structured data. The structured data format best suited to the crawler is JSON-LD (https://www.infoworld.com/article/2335036/intro-to-json-ld-json-for-the-semantic-web.html).
When using JSON-LD, the minimum data required for a piece of editorial content is:
| Property | Type | Description |
| --- | --- | --- |
| type | text | Content type |
| description | text | Content description |
| name | text | Content name |
| image | URL | Content image |
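As a hedged illustration of these minimum requirements, the sketch below reads the JSON-LD block of a page and reports which required properties are missing. It assumes that the `type` property in the table corresponds to the JSON-LD `@type` keyword. The sample page and the names `JsonLdParser`, `extract_json_ld`, and `missing_properties` are hypothetical.

```python
import json
from html.parser import HTMLParser

# Assumption: "type" in the table maps to the JSON-LD keyword "@type".
REQUIRED_PROPERTIES = ("@type", "name", "description", "image")

# Hypothetical content page carrying one JSON-LD block.
PAGE_HTML = """
<html><head>
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "Recipe",
  "name": "Viennese coffee",
  "description": "An easy Viennese coffee recipe.",
  "image": "https://www.example.com/images/viennese-coffee.jpg"
}
</script>
</head><body></body></html>
"""


class JsonLdParser(HTMLParser):
    """Collect the raw text of every <script type="application/ld+json">."""

    def __init__(self) -> None:
        super().__init__()
        self.blocks = []
        self._buffer = None  # list of text chunks while inside a JSON-LD script

    def handle_starttag(self, tag, attrs):
        script_type = (dict(attrs).get("type") or "").strip().lower()
        if tag == "script" and script_type == "application/ld+json":
            self._buffer = []

    def handle_data(self, data):
        if self._buffer is not None:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._buffer is not None:
            self.blocks.append("".join(self._buffer))
            self._buffer = None


def extract_json_ld(html_text: str) -> list:
    parser = JsonLdParser()
    parser.feed(html_text)
    parser.close()
    return [json.loads(block) for block in parser.blocks]


def missing_properties(item: dict) -> list[str]:
    """List the required properties that are absent or empty."""
    return [prop for prop in REQUIRED_PROPERTIES if not item.get(prop)]


items = extract_json_ld(PAGE_HTML)
```

Calling `missing_properties(items[0])` on the sample page returns an empty list, since all four properties are present.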
Additional properties and their data can be retrieved depending on the type of content to better describe the editorial content.
The list of these properties is available on schema.org, the structured data vocabulary used across the Web. Search engine indexing bots use this structured data to better capture and understand the meaning of your content, which can improve your SEO.
Sensefuel can also crawl other types of structured data embedded in HTML tags, such as Microdata (the “itemprop” and “itemtype” attributes) or RDFa (the “property” and “typeof” attributes), to complete or replace the data from the JSON-LD tag of your content.
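To make the Microdata case concrete, here is a simplified sketch that reads “itemtype” and “itemprop” from a page. It handles only a single flat item and a reduced set of value rules; the sample markup and the name `MicrodataParser` are hypothetical, and RDFa would follow the same idea with “typeof” and “property”.

```python
from html.parser import HTMLParser

# Hypothetical page fragment annotated with Microdata.
MICRODATA_HTML = """
<div itemscope itemtype="https://schema.org/Recipe">
  <h1 itemprop="name">Viennese coffee</h1>
  <img itemprop="image" src="https://www.example.com/images/viennese-coffee.jpg" alt="">
  <p itemprop="description">An easy Viennese coffee recipe.</p>
</div>
"""

# For these tags the value is carried by an attribute rather than by the text
# (a simplified subset of the Microdata value rules).
VALUE_ATTRIBUTES = {"img": "src", "a": "href", "link": "href", "meta": "content"}


class MicrodataParser(HTMLParser):
    """Collect the itemtype and the itemprop values of one flat item."""

    def __init__(self) -> None:
        super().__init__()
        self.item_type = None
        self.properties = {}
        self._open = None  # (itemprop, tag) whose text is being collected

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if self.item_type is None and attributes.get("itemtype"):
            self.item_type = attributes["itemtype"]
        prop = attributes.get("itemprop")
        if not prop:
            return
        if tag in VALUE_ATTRIBUTES:
            self.properties[prop] = attributes.get(VALUE_ATTRIBUTES[tag]) or ""
        else:
            self._open = (prop, tag)
            self.properties[prop] = ""

    def handle_data(self, data):
        if self._open is not None:
            self.properties[self._open[0]] += data

    def handle_endtag(self, tag):
        # Close the text property when the tag that opened it ends.
        if self._open is not None and tag == self._open[1]:
            prop = self._open[0]
            self.properties[prop] = self.properties[prop].strip()
            self._open = None


parser = MicrodataParser()
parser.feed(MICRODATA_HTML)
parser.close()
```

After parsing, `parser.item_type` holds the schema.org type and `parser.properties` holds the name, image, and description, which are the same minimum fields expected from JSON-LD.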
Additionally, Sensefuel can retrieve data from any HTML element of your content to better understand and classify it (for example, to generate filters in the search). However, because the HTML of pages is more subject to change, this type of data crawling is less recommended, for the same reasons explained at the beginning of paragraph 3.1.
Recipe example (https://schema.org/Recipe):
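As an illustrative sketch, a recipe page could expose JSON-LD like the one generated below. The property names come from https://schema.org/Recipe; every value, the example.com URL, and the helper name `to_json_ld_script` are hypothetical.

```python
import json

# Hypothetical recipe described with schema.org Recipe properties.
recipe = {
    "@context": "https://schema.org",
    "@type": "Recipe",
    "name": "Viennese coffee",
    "description": "An easy Viennese coffee topped with whipped cream.",
    "image": "https://www.example.com/images/viennese-coffee.jpg",
    "recipeCategory": "Drink",
    "prepTime": "PT5M",    # ISO 8601 duration: 5 minutes
    "totalTime": "PT10M",  # ISO 8601 duration: 10 minutes
    "recipeYield": "2 servings",
    "recipeIngredient": [
        "2 espresso shots",
        "100 ml whipped cream",
        "Cocoa powder",
    ],
    "recipeInstructions": [
        {"@type": "HowToStep", "text": "Brew the espresso."},
        {"@type": "HowToStep", "text": "Top with whipped cream and cocoa powder."},
    ],
}


def to_json_ld_script(data: dict) -> str:
    """Wrap a dictionary in the <script> tag used to embed JSON-LD in a page."""
    body = json.dumps(data, indent=2, ensure_ascii=False)
    return '<script type="application/ld+json">\n' + body + "\n</script>"


recipe_script = to_json_ld_script(recipe)
print(recipe_script)
```

Beyond the four minimum properties, fields such as `recipeCategory` or `totalTime` are the kind of additional data that can help describe and classify the content.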
Guide example (https://schema.org/Guide):
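A buying guide could be described in a similar way. The sketch below builds a hypothetical JSON-LD document with properties from https://schema.org/Guide; the values, the organization name, and the example.com URL are invented for illustration.

```python
import json

# Hypothetical buying guide described with schema.org Guide properties.
guide = {
    "@context": "https://schema.org",
    "@type": "Guide",
    "name": "How to choose a pallet rack",
    "description": "The main criteria to compare before buying a pallet rack.",
    "image": "https://www.example.com/images/pallet-rack-guide.jpg",
    "about": {"@type": "Thing", "name": "Pallet racks"},
    "author": {"@type": "Organization", "name": "Example Store"},
    "datePublished": "2024-01-15",
    "reviewAspect": ["Load capacity", "Dimensions"],
}

# Serialized form, to be placed in a <script type="application/ld+json"> tag.
guide_json_ld = json.dumps(guide, indent=2, ensure_ascii=False)
print(guide_json_ld)
```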
Shop example (source: https://schema.org/LocalBusiness):
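For a shop page, a minimal sketch using properties from https://schema.org/LocalBusiness might look like this. The shop name, address, phone number, and URLs are placeholders, not real data.

```python
import json

# Hypothetical shop described with schema.org LocalBusiness properties.
shop = {
    "@context": "https://schema.org",
    "@type": "LocalBusiness",
    "name": "Example Coffee Shop",
    "description": "Coffee beans, machines and accessories.",
    "image": "https://www.example.com/images/shop-front.jpg",
    "url": "https://www.example.com/shops/example-coffee-shop",
    "telephone": "+33 1 00 00 00 00",
    "address": {
        "@type": "PostalAddress",
        "streetAddress": "1 Example Street",
        "addressLocality": "Lille",
        "postalCode": "59000",
        "addressCountry": "FR",
    },
    "openingHours": "Mo-Sa 09:00-19:00",
}

# Serialized form, to be placed in a <script type="application/ld+json"> tag.
shop_json_ld = json.dumps(shop, indent=2, ensure_ascii=False)
print(shop_json_ld)
```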
To learn more about structured data and the semantic web, follow this link: https://developers.google.com/search