How Petal Search Works
Petal Search is a search engine designed to show users the most relevant information for their searches, taking into account the web pages, websites and apps that users find most relevant to their interests.
The information below is intended to help you understand how Petal Search finds, indexes and ranks your websites and mobile applications, and how you can improve your rankings.
1. Ranking Criteria
2. How Petal Search finds and crawls webpages
3. How to improve your rankings
4. Bad practice examples
5. Safe Search
1. Ranking Criteria - Key factors for Petal Search page ranking
Given a query, Petal Search uses an algorithm to match the query against the content in the index and produce a ranked list of web page results. Petal Search is constantly improving its existing algorithms and designing new ones to improve both the effectiveness and the efficiency of the search engine.
Petal Search uses many factors to rank pages; some of the main ones are listed below. In the ranking process, the relative importance of these factors may change depending on the query and over time.
Relevance: Relevance measures how closely a query is related to the content. Petal Search considers not only how well the query matches the content, but also the semantic similarity between the query and the content. For example, synonyms and abbreviations are treated as having the same meaning as the original term and are also considered in the matching process.
Quality and Credibility: The integrity and richness of the web page, the reputation of the author, the credibility of the source and the website, the available description, screenshots and links, the availability of the web page and website, the page loading time, etc. are all considered in the ranking process of Petal Search. These details should provide a good representation of your website or application, since they help users decide which result to choose.
Users’ ratings and reviews of your website or, mainly, your application are also considered when assessing its quality.
Location: Petal Search considers the user’s search context, such as where the search takes place (country and city) and the language of the query, as well as the location where the webpage is hosted, in order to show users the results that are most relevant to them.
Timeliness: Petal Search considers whether a web page provides up-to-date information for a given query or not. It generally prefers web pages which provide fresh content, especially for time-sensitive queries.
User interaction: User interaction factors are also considered in the ranking process. For example, whether users click the search results for a given query, which results are clicked or skipped, how much time users spend on the search results, whether the user reformulates a query, etc.
Even though the characteristics of the goods or services offered by your app, website or web page do not affect how Petal Search ranks results, please bear in mind that any illegal or harmful content is filtered out. Like other search engines, we take proactive measures to detect and remove illegal content online. What is illegal offline is also illegal online: incitement to terrorism, xenophobic and racist speech that publicly incites hatred and violence, and child sexual abuse material are all illegal.
Petal Search results are produced without prioritizing Huawei’s products or services. However, Petal Search may provide a separate way, such as banners or an answer box outside of the search results, to promote a Huawei product or service that is related to the search.
2. How Petal Search finds and crawls webpages
To allow PetalBot to crawl more webpages, we recommend increasing the number of effective links, improving the quality of your webpages, and not adding PetalBot to the list of prohibited crawlers in the robots.txt file.
Sitemap: A Sitemap gives Petal Search information about webpages, videos, pictures and other resources. The web pages and resources listed in the Sitemap are treated by Petal Search as important information, crawled, and parsed as candidates for indexing. Since Petal Search crawls Sitemaps regularly, we suggest updating your Sitemap whenever the website changes, for example by deleting expired or invalid links and other useless resources. We recommend using the XML format to make it easier for Petal Search to find URLs and other resources. Petal Search can find the Sitemap of a website if its path is specified in the robots.txt file, for example http://www.example.com/Sitemap_URL.xml.
The guidelines for Sitemaps are as follows:
• Petal Search currently supports XML and text format files.
• Please use the <lastmod> element to indicate when each URL was last updated.
• The maximum Sitemap file size is 50MB. If necessary, split URLs and other resources across several Sitemaps so that no single Sitemap exceeds 50MB.
• It is recommended to describe resources in the Sitemap with standard URLs so that Petal Search can better crawl the relevant webpages and resources.
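As a minimal sketch of the Sitemap guidelines above, the following Python snippet builds an XML Sitemap with a <lastmod> value for each URL and checks its size. The namespace comes from the sitemaps.org protocol; the function names, the about.html URL, and the reading of "50MB" as 50 * 1024 * 1024 bytes are assumptions made for illustration only.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
# Assumption: "50MB" is interpreted here as 50 * 1024 * 1024 bytes.
MAX_SITEMAP_BYTES = 50 * 1024 * 1024


def build_sitemap(entries):
    """Return Sitemap XML for an iterable of (url, last_modified) pairs."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for loc, lastmod in entries:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = loc
        ET.SubElement(url, "lastmod").text = lastmod
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


def fits_size_limit(xml_text):
    """Check the encoded size against the assumed 50MB limit."""
    return len(xml_text.encode("utf-8")) <= MAX_SITEMAP_BYTES


# Hypothetical pages; replace with the real URLs of your site.
sitemap_xml = build_sitemap([
    ("http://www.example.com/", "2020-06-28T11:00:00+08:00"),
    ("http://www.example.com/about.html", "2020-06-01"),
])
```

If fits_size_limit returns False, the list of entries can be split and build_sitemap called once per part.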
Robots.txt: As part of the Robots Exclusion Protocol, a robots.txt file is placed in the root directory of a website to tell PetalBot whether webpages and documents may be crawled. It can help prevent the website from being overloaded by PetalBot.
• The robots.txt file should be placed in the root directory of the website, for example http://www.example.com/robots.txt. Please do not store it in any other directory.
• If you want to prohibit PetalBot from crawling a web page, you can block PetalBot’s requests to access that page. However, it will take time for the page to be removed from Petal Search results.
• Update the robots.txt file regularly, and make sure that the webpages prohibited from crawling are the correct ones.
• It is recommended to prohibit crawling of invalid and low-quality URLs and directories.
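Before publishing a robots.txt file, it can help to check which URLs it actually blocks for PetalBot. The sketch below uses Python's standard urllib.robotparser module on a sample file; the disallowed paths (/tmp/ and /search) and the helper name are hypothetical, and the result reflects how the Python parser reads the rules, which may differ in detail from PetalBot's own behavior.

```python
from urllib.robotparser import RobotFileParser

# Sample robots.txt content; the disallowed paths are hypothetical.
ROBOTS_TXT = """\
User-agent: PetalBot
Disallow: /tmp/
Disallow: /search

User-agent: *
Allow: /

Sitemap: http://www.example.com/Sitemap_URL.xml
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def petalbot_may_crawl(url):
    """Return True if the sample rules allow PetalBot to fetch the URL."""
    return parser.can_fetch("PetalBot", url)


allowed = petalbot_may_crawl("http://www.example.com/articles/1.html")
blocked = petalbot_may_crawl("http://www.example.com/tmp/old.html")
sitemaps = parser.site_maps()  # Sitemap URLs declared in the file
```

Running the same check over the list of pages you expect to be indexed is a quick way to confirm that no important page is prohibited by mistake.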
Links: A link is a connection from a web page to a destination, which can be another web page, another location on the same web page, a picture, an e-mail address, a file, or even an application. Links indicate the popularity of a site or resource. If a site is often linked to by other sites, it is more likely to be included by Petal Search. High-quality and popular web content is more likely to be linked to by other sites, which allows Petal Search to identify and collect it promptly. PetalBot crawls links on the site, and links to it from other sites, in accordance with the site’s robots protocol.
• It is recommended to keep links short; they should not exceed 200 characters.
• It is recommended to encode URLs in accordance with the RFC 3986 standard.
• It is recommended to limit the number of links on each page to no more than 5000.
• It is recommended to delete invalid links from Sitemaps promptly.
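The length and encoding recommendations above can be checked with a few lines of Python. This is only a sketch: the function names are hypothetical, and the choice of which characters to leave unescaped in the path and query is an assumption that covers common cases rather than every detail of RFC 3986.

```python
from urllib.parse import quote, urlsplit, urlunsplit

MAX_LINK_LENGTH = 200  # limit recommended in the guidelines above


def normalize_link(url):
    """Percent-encode unsafe characters in the path and query of a URL."""
    parts = urlsplit(url)
    # "%" is kept as-is so that already encoded sequences are not encoded twice.
    path = quote(parts.path, safe="/%")
    query = quote(parts.query, safe="=&%")
    return urlunsplit((parts.scheme, parts.netloc, path, query, parts.fragment))


def is_link_length_ok(url):
    """Return True if the link does not exceed the recommended length."""
    return len(url) <= MAX_LINK_LENGTH


encoded = normalize_link("http://www.example.com/my page/é.html?q=a b")
```

Here the space and the non-ASCII character are replaced by their percent-encoded UTF-8 forms, while the separators "/", "?", "=" and "&" are left untouched.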
Webpages: Improve the quality of your webpages so that Petal Search can identify and collect them promptly, as follows:
• It is recommended to remove duplicate pages.
• It is recommended to use the same URL on PC and mobile devices.
• Configure URL parameters so as to reduce redundant ones.
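One way to reduce redundant URL parameters is to map every variant of a URL to a single canonical form before publishing links. The Python sketch below drops parameters that do not change the page content and sorts the rest; the parameter names in IGNORED_PARAMS and the function name are assumptions chosen for illustration, so adjust them to the parameters your own site uses.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical parameters that do not change the content of a page.
IGNORED_PARAMS = {"utm_source", "utm_medium", "utm_campaign", "sessionid"}


def canonical_url(url):
    """Drop ignored parameters and sort the rest so variants compare equal."""
    parts = urlsplit(url)
    kept = sorted(
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key not in IGNORED_PARAMS
    )
    # The fragment is dropped because it is not sent to the server.
    return urlunsplit(
        (parts.scheme, parts.netloc.lower(), parts.path, urlencode(kept), "")
    )


first = canonical_url("http://www.example.com/item?id=7&utm_source=mail")
second = canonical_url("http://www.example.com/item?utm_medium=x&id=7")
```

Both calls produce the same URL, so the two variants can be treated as one page rather than as duplicates.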
Redirect: If a webpage redirects to another URL permanently, it is recommended to use the 301 HTTP status code. If the redirection is temporary, the 302 HTTP status code is recommended.
It is recommended to use the 404 HTTP status code to indicate that a webpage has become invalid. Once a web page is no longer valid, it will take some time for Petal Search to remove it from the search results.
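The status code recommendations above can be summarized as a small lookup. The Python sketch below uses the standard http.HTTPStatus enumeration; the page state names and the function name are hypothetical labels for illustration, not part of any Petal Search interface.

```python
from http import HTTPStatus

# Hypothetical page states mapped to the recommended status codes.
STATUS_BY_STATE = {
    "moved_permanently": HTTPStatus.MOVED_PERMANENTLY,  # 301
    "moved_temporarily": HTTPStatus.FOUND,              # 302
    "invalid": HTTPStatus.NOT_FOUND,                    # 404
}


def status_for(page_state):
    """Return the status code to send for a page in the given state."""
    return STATUS_BY_STATE.get(page_state, HTTPStatus.OK)


code = int(status_for("moved_permanently"))
```

A web server or framework handler would send this code together with a Location header for the two redirect cases.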
3. How to improve your rankings
The quality of your webpage, website or app, including its title, description or photos:
• Create an accurate title for your page: It’s very important to create an accurate title for each page and place it in the <title> tag. The page title should be as descriptive and unique as possible. To ensure the accuracy of the title, please avoid invalid characters, garbled code, or any inconsistency between the title and content topic.
• Add a concise summary to your page: Please use <meta name="description"> to add a concise summary to your page to help Petal Search better understand the page content. The description can also be shown as part of the search results to give users a better experience.
• Tag pages which should not be indexed: If you do not want certain pages to appear in Petal Search results, add <meta name="robots" content="noindex"> to the <head> tag of the page.
• Add the publish time with time zone for article pages, news pages, etc.: The publish time of a page affects result ranking. Use the <time> tag, <meta> metadata, etc. to add the publish time of the page. The time string should include time zone information wherever possible, like 2020-06-28 11:00+0800. Please avoid time zone confusion, so that Petal Search does not misinterpret the publish time of a web page, which would affect the timeliness judgment of the page and its ranking.
• Web link extraction: Petal Search extracts the links in the <a> tags on a page and crawls the content of the linked pages. If you do not want Petal Search to follow any of the links on the page, add <meta name="robots" content="nofollow"> to the <head> tag; if you do not want Petal Search to follow a particular link, add rel="nofollow" to its <a> tag.
• Image link extraction: Use the <img> tag or <picture> tag to specify the image file to be displayed. Please try to use alt attributes for <img> to add text description so that Petal Search can better understand the image content. Also, including high-quality pictures of your items might help.
• Add structured data: Add structured data to the page so that Petal Search can better understand the information on the page and display its content in a more user-friendly way. It is recommended to adhere to the Schema.org specification and to use the JSON-LD or Microdata format to add the structured data. For the Schema.org specification, please refer to https://schema.org.
• Use the <h1>-<h6> tags to define headings and paragraph structure: Marking up headings and paragraph structure with the <h1>-<h6> tags helps Petal Search understand the content of a page.
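Several of the recommendations above can be checked automatically before a page is published. The following Python sketch uses the standard html.parser module to look for a title, a meta description, alt text on images and an <h1> heading. The class name, function name, sample page and the set of checks are assumptions made for illustration; they are not a description of how Petal Search evaluates pages.

```python
from html.parser import HTMLParser


class PageAudit(HTMLParser):
    """Collect the tags that the recommendations above refer to."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.description = None
        self.images_without_alt = 0
        self.headings = []
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta" and (attrs.get("name") or "").lower() == "description":
            self.description = attrs.get("content")
        elif tag == "img" and not attrs.get("alt"):
            self.images_without_alt += 1
        elif tag in {"h1", "h2", "h3", "h4", "h5", "h6"}:
            self.headings.append(tag)

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def audit(html):
    """Return a list of human-readable problems found in the page."""
    page = PageAudit()
    page.feed(html)
    page.close()
    problems = []
    if not page.title.strip():
        problems.append("missing <title>")
    if not page.description:
        problems.append("missing meta description")
    if page.images_without_alt:
        problems.append(f"{page.images_without_alt} <img> without alt text")
    if "h1" not in page.headings:
        problems.append("missing <h1>")
    return problems


# Hypothetical page used to try out the checks.
SAMPLE_PAGE = """<html><head><title>Blue widgets</title>
<meta name="description" content="Hand-made blue widgets.">
</head><body><h1>Blue widgets</h1><img src="widget.png"></body></html>"""

problems = audit(SAMPLE_PAGE)
```

For the sample page, the only problem reported is the image without alt text.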
4. Bad practice examples
To improve the technology and content of a site, developers use search engine optimization (SEO) techniques to make the site more relevant and accessible to search engine crawlers. As with other search engines, most SEO practices make websites more attractive to our engine. However, extreme practices such as the misuse of certain SEO techniques do not guarantee that the ranking of the site will improve or that search traffic will increase; instead, they lead to punitive action by search engines. Generally, sites that abuse such techniques are marked as low-quality and are not indexed. Some examples of abuse that should be avoided are as follows:
Malicious page: Pages that carry out phishing or carry viruses, trojans, or other malware will be removed from the index and, in severe cases, even duplicate websites will be deleted from the index.
Keyword stuffing: Deliberately overusing certain words or similar words on a web page in the hope of obtaining better search engine rankings, or introducing specific, intent-driven, trending words that are irrelevant to the content of the web page in the hope of being matched when users search for them. Both are considered cheating and will result in the site being removed from the index.
Pieced-together or automatically generated content: Content collected from other reputable websites by crawling or reprinting, with no original content; content taken from multiple sites does not fit together, so readers cannot obtain the information they need. In such cases, the related web page is suppressed or even removed from the index. The same applies to machine-generated content produced automatically by article generators: the related web page will be suppressed or even removed from the index.
Cloaking: Cloaking is a way of deceiving search engine crawlers by serving the crawler one version of the content while the content that users see is unhealthy or illegal. It may result in the site being removed from the index.
Site Group: A site group is a number of artificially created, similar websites that link to the same website in order to improve its search ranking by increasing the number of links pointing to it, even though these links do not represent the actual popularity of the site; or multiple similar websites that link to each other in order to obtain a lot of traffic through search engines. Such sites fail to increase the number of high-quality links and may be removed from the index.
Link Spam: Linking or redirecting to unhealthy, illegal, or malicious pages, such as phishing pages, is considered cheating and will result in the related pages being removed from the index.
Duplicate Content: Pages with the same content available at multiple URLs, including duplicates caused by too many URL parameters, are considered low-value pages and may be filtered out of the index.
Instability: Websites that change frequently are considered insufficiently stable, which affects user experience, and are suppressed in website ranking. Developers should regularly maintain the stability of their websites.
5. Safe Search
With ever-increasing amounts of material on the internet, we do our best to protect our users, especially children, the most vulnerable of all our users. When users enable the safe search function, SafeSearch prevents adult-only and offensive content from showing in search results.
SafeSearch specifies whether search results can contain explicit content in text, images and videos. Machine learning and deep learning algorithms are used to classify adult content. However, due to the current limitations of these algorithms, we cannot promise to filter out all pornographic content. We recommend that you help Petal Search improve its adult content filtering capability.
We collect the adult content label in the following ways:
• Using <meta name="rating" content="adult">
• Grouping adult images, videos and other content in a special resource path: http://www.example.com/adult/image.png, http://www.example.com/adult/video
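Site owners can verify that their own pages carry one of the two labels above with a short script. The Python sketch below checks for the rating meta tag and for the /adult/ path prefix shown in the examples; the class and function names are hypothetical, and the snippet is a self-check for your own site, not a description of how SafeSearch classifies content.

```python
from html.parser import HTMLParser
from urllib.parse import urlsplit


class RatingParser(HTMLParser):
    """Record the value of a <meta name="rating"> tag, if present."""

    def __init__(self):
        super().__init__()
        self.rating = None

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "meta" and (attrs.get("name") or "").lower() == "rating":
            self.rating = (attrs.get("content") or "").lower()


def is_labelled_adult(url, html=""):
    """Return True if the URL path or the page markup carries an adult label."""
    if urlsplit(url).path.startswith("/adult/"):
        return True
    parser = RatingParser()
    parser.feed(html)
    parser.close()
    return parser.rating == "adult"


by_path = is_labelled_adult("http://www.example.com/adult/image.png")
by_meta = is_labelled_adult(
    "http://www.example.com/page.html",
    '<head><meta name="rating" content="adult"></head>',
)
```

Either label is sufficient for the check to return True.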
If you are a business user and you have a complaint about how we rank results, please refer to the P2B complaints procedure.