Online Presence & Orderables: Best Practices

Best practices for using Baselayer’s Orderables and Online Presence products

Baselayer’s Online Presence and Orderable products give you deeper insight into a business’s digital footprint and operating profile.

These tools can be requested independently or bundled directly into a Business Search to enrich KYB decisions with context such as website legitimacy, online activity, social presence, and predicted industry classification.

The three products available today are:

  • Website Analysis
  • Industry Prediction
  • Social Media & Reviews (via options: Enhanced)

This guide covers how to interpret the results returned by each of the products, and basic recommendations about how to use them within an onboarding workflow.

For a deeper dive into how to request and retrieve these analyses, and the different paths available, please read Online Presence: Basics.




Website Analysis Fields (domain legitimacy & contact discovery)

Website Analysis provides domain-level intelligence to confirm a business’s legitimacy and online presence.

Key data points include:

  • business_website_match → Returned in the Business Search. Whether the submitted website matches Baselayer’s records (true, false, or null).
  • parked → Whether the domain is undeveloped or inactive.
  • email_deliverable → Whether emails from this domain are deliverable.
  • domain_age_months → Age of the domain in months. Most useful when compared with the business’s months in business.
  • website_build_status → Enum describing the overall operational status of the website: coming_soon, inactive, or active. Helps identify placeholder or non-operational pages.
  • ssl_validity.is_valid → Boolean showing whether the website’s SSL certificate is valid and active. A key indicator of site authenticity and security hygiene.
  • website_summary → A short text summary automatically generated from the website’s homepage, describing its content and focus. Useful for quick screening and keyword-based rules.

Best practices

Look for website completeness: use website_build_status and parked together to quickly assess whether the site is live. Treat coming_soon or inactive websites as early-stage or possibly non-operational.

Evaluate domain legitimacy: flag domains with ssl_validity.is_valid ≠ true as high risk.

Compare domain age vs. business age: use domain_age_months to detect newly registered sites, especially if the business claims years of history. A large mismatch (e.g., a 5-year-old business with a 1-month-old domain) can indicate risk.

Verify contact consistency: cross-check discovered emails and phone numbers with the application data. Baselayer discovering a domain where email_deliverable = true, and the application email being a gmail or outlook email, can indicate impersonation or synthetic identity risk.
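The contact-consistency check above can be sketched in a few lines of Python. This is a minimal illustration, not Baselayer's implementation: the helper names and the FREE_EMAIL_DOMAINS list are hypothetical, and it assumes you already hold the application email, the website Baselayer analyzed, and the email_deliverable value.

```python
from urllib.parse import urlparse

# Hypothetical list of free mailbox providers; extend it to fit your policy.
FREE_EMAIL_DOMAINS = {"gmail.com", "outlook.com", "hotmail.com", "yahoo.com"}


def website_domain(website):
    """Return the host of a website URL, lowercased, without a leading 'www.'."""
    # urlparse only fills in the host when a scheme is present, so add one if missing.
    if "://" not in website:
        website = "https://" + website
    host = (urlparse(website).hostname or "").lower()
    return host[4:] if host.startswith("www.") else host


def email_domain_flag(application_email, website, email_deliverable):
    """Return a flag name when the applicant's email does not use the site's domain."""
    if email_deliverable is not True:
        return None  # no deliverable corporate domain to compare against
    site = website_domain(website)
    applicant_domain = application_email.rsplit("@", 1)[-1].strip().lower()
    # Treat the site's own domain and its subdomains as consistent.
    if applicant_domain == site or applicant_domain.endswith("." + site):
        return None
    if applicant_domain in FREE_EMAIL_DOMAINS:
        return "free_provider_email"
    return "unrelated_email_domain"
```

For example, email_domain_flag("jane@gmail.com", "https://acme.com", True) returns "free_provider_email", while an application email on acme.com returns None. Whether a free-provider email should be flagged or only reviewed is a policy decision for your own risk team.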




Industry Prediction Fields (business classification & risk)

Accurate industry classification is central to risk management, network compliance, and underwriting.
Baselayer’s Industry Prediction product analyzes multiple signals, including web data, to return standard codes and risk indicators.

Key data points include:

  • naics_code → 6-digit North American Industry Classification System code.
  • mcc_code → Merchant Category Code.
  • sic_code → Standard Industrial Classification code.
  • accuracy → Baselayer’s confidence level (0–1 scale) on the prediction.
  • keywords[] → 4–8 keywords representing core business activity.
  • risk_level → Baselayer’s normalized risk tier (low, medium, high).
  • mastercard_risk → Boolean showing if Mastercard considers the MCC high-risk.
  • visa_risk_tier → Visa’s risk tier classification (1–3, null).

Best practices

Establish prohibited industry lists: use NAICS at 2-digit (sector), 4-digit (group), or 6-digit (specific) levels to filter restricted categories according to your risk policy.

Monitor keywords for nuance: detect sensitive terms (e.g., “pain”, “vape”, “escort”) that may flag restricted activities even within approved industries. Baselayer can provide a list of standard keywords for review.

Set confidence thresholds: Baselayer recommends accuracy ≥ 0.75 as a strong reliability benchmark. Predictions with accuracy below 0.75 are usually based on thin online data and should be reviewed to confirm their accuracy.

Leverage network risk indicators: align mastercard_risk and visa_risk_tier with card network compliance rules.
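As a rough illustration of the prohibited-list, keyword, and confidence practices above, the sketch below matches NAICS codes by prefix so a single list can hold 2-, 4-, and 6-digit entries. The list contents and function names are hypothetical placeholders; only naics_code, keywords[], and accuracy come from the product fields described here.

```python
# Hypothetical policy lists; replace them with your own risk policy.
PROHIBITED_NAICS_PREFIXES = {
    "7132",    # 4-digit group: Gambling Industries
    "312230",  # 6-digit code: Tobacco Manufacturing
}
RESTRICTED_TERMS = {"vape", "escort", "pain"}


def naics_is_prohibited(naics_code, prefixes=PROHIBITED_NAICS_PREFIXES):
    """True when the code starts with any prohibited 2-, 4-, or 6-digit prefix."""
    if not naics_code:
        return False
    code = str(naics_code)
    return any(code.startswith(prefix) for prefix in prefixes)


def sensitive_keywords(keywords, restricted=RESTRICTED_TERMS):
    """Return the keywords that contain a restricted term as a whole word."""
    hits = []
    for keyword in keywords or []:
        # Whole-word matching avoids false hits such as 'pain' inside 'painting'.
        if set(keyword.lower().split()) & restricted:
            hits.append(keyword)
    return hits


def needs_review(accuracy, floor=0.75):
    """True when the prediction is missing or below the confidence floor."""
    return accuracy is None or accuracy < floor
```

Prefix matching keeps the policy list short: adding "7132" covers every 6-digit code in that group without listing each one.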




Social Profiles (online identity footprint)

Social profiles give visibility into how a business presents itself online: from its brand activity to its customer engagement and operational transparency.

Baselayer automatically discovers and validates these profiles as part of Enhanced Search or Web Presence analyses.

Each social profile is represented as an object within the social_profiles[] array.

| Field | Description |
| --- | --- |
| site | Platform identifier. Possible values: linked_in:company, linked_in:personal, twitter, x, facebook, instagram, youtube, tiktok, pinterest. |
| username | The handle or username of the profile. |
| url | Direct URL to the social profile. |
| confidence | Baselayer’s confidence level that the profile belongs to the business (high/medium/low). |
| metadata | Platform-specific attributes (see below). |

Platform-specific metadata

While each platform shares similar fields (followers, contact information, etc.), their structures vary slightly.
Below are the main attributes to expect per platform and how to use them effectively.

Instagram

  • is_private
  • is_business_account
  • has_business_address
  • bio
  • followers_count
  • phone_number
  • email
  • business_website

Use it for: verifying that the profile is public and business-oriented.
Tip: profiles that are is_business_account = true and has_business_address = true are generally legitimate, while is_private = true profiles should be treated as low-confidence.

LinkedIn

  • company_size_range
  • industry
  • followers_count
  • number_of_employees
  • phone_number
  • email
  • business_website

Use it for: estimating company scale and validating professional legitimacy.
Tip: LinkedIn company pages are often the most reliable business profiles.
The company_size_range and number_of_employees fields can be compared to application data or risk thresholds.

Facebook

  • is_business_page
  • has_reviews
  • check_ins_count
  • followers_count
  • phone_number
  • email
  • business_website

Use it for: validating local or consumer-facing businesses.
Tip: has_reviews = true and high check_ins_count values reflect customer engagement, especially useful for retail, restaurants, and service industries.

X (formerly Twitter)

  • is_verified
  • follower_count
  • bio
  • joined_date
  • phone_number
  • email
  • business_website

Use it for: assessing brand maturity and communication activity.
Tip: an early joined_date and is_verified = true are strong indicators of long-term presence and authenticity.

YouTube

  • is_verified
  • subscriber_count
  • channel_type
  • has_business_email
  • channel_description
  • followers_count
  • phone_number
  • email
  • business_website

Use it for: confirming active content creation and marketing activity.
Tip: Channels with is_verified = true or has_business_email = true typically belong to real businesses.

TikTok

  • is_verified
  • subscriber_count
  • channel_type
  • has_business_email
  • channel_description
  • followers_count
  • phone_number
  • email
  • business_website

Use it for: confirming active content creation and marketing activity.
Tip: Channels with is_verified = true or has_business_email = true usually belong to legitimate businesses.

Pinterest

  • follower_count
  • monthly_views
  • bio
  • has_business_website
  • phone_number
  • email
  • business_website

Use it for: identifying creative, e-commerce, or design-related businesses.
Tip: monthly_views and has_business_website = true indicate reach and engagement.

Best practices

Prioritize confidence and consistency: profiles with confidence = high and is_verified = true should be trusted as strong supporting signals.

Cross-reference domains and emails: match business_website and email fields across platforms and your submitted application data to spot inconsistencies or impersonation.

Look for breadth, not just presence: a legitimate business often has at least one professional network (LinkedIn) and one public-facing network (e.g., Instagram or Facebook).

Weigh social relevance by sector: high follower counts or reviews matter more for consumer brands than B2B firms. Specific platforms might only be relevant for niche products. Social media presence might not be relevant at all for certain sectors. Adjust thresholds accordingly.
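The confidence, breadth, and cross-reference practices above could be combined along these lines. This is a sketch under assumptions: it treats each entry of social_profiles[] as a dict with site, confidence, and a nested metadata dict holding business_website, and the two platform groupings are illustrative choices rather than Baselayer definitions.

```python
# Illustrative groupings; adjust them to the platforms that matter in your sector.
PROFESSIONAL_SITES = {"linked_in:company"}
PUBLIC_FACING_SITES = {"instagram", "facebook"}


def _host(url):
    """Reduce a URL to a bare lowercase host without scheme, path, or 'www.'."""
    host = url.lower().split("://", 1)[-1].split("/", 1)[0]
    return host[4:] if host.startswith("www.") else host


def trusted_profiles(social_profiles):
    """Keep only the profiles Baselayer matched with high confidence."""
    return [p for p in social_profiles if p.get("confidence") == "high"]


def has_breadth(social_profiles):
    """True when trusted profiles cover a professional and a public-facing network."""
    sites = {p.get("site") for p in trusted_profiles(social_profiles)}
    return bool(sites & PROFESSIONAL_SITES) and bool(sites & PUBLIC_FACING_SITES)


def website_mismatches(social_profiles, submitted_website):
    """List the platforms whose business_website differs from the submitted one."""
    expected = _host(submitted_website)
    mismatched = []
    for profile in trusted_profiles(social_profiles):
        found = (profile.get("metadata") or {}).get("business_website")
        if found and _host(found) != expected:
            mismatched.append(profile.get("site"))
    return mismatched
```

A mismatch is a prompt for review rather than proof of impersonation, since some businesses legitimately link a storefront or booking site from their social profiles.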




Found Reviews (customer reputation insights)

The reviews[] array provides consolidated information about a business’s online reputation, summarizing reviews and ratings from public sources such as Google, Yelp, Trustpilot, TripAdvisor, and others.

Each object in this array represents one review source and includes key metadata that can help you assess credibility, customer satisfaction, and operational presence.

Fields in each review object

| Field | Description |
| --- | --- |
| source | The platform where reviews were found. Possible values: yelp, google, trustpilot, tripadvisor, other. |
| url | Direct URL to the review profile or business page on that platform. |
| confidence | Baselayer’s confidence that the review profile belongs to the business. Possible values: high, medium, low. |
| rating | Average numeric rating (e.g., 4.5 for 4.5 out of 5). |
| volume | The total number of reviews collected for that source. |
| summary | Textual summary or sentiment overview of the reviews (e.g., “Consistently positive service feedback from verified customers”). |
| phone_number | Business phone number found in the review profile, if available. |
| address | Business address found in the review profile, if available. |
| business_website | Business website associated with the review profile, if found. |

How to use these fields

  • confidence → Helps you filter reliable matches. For example, only consider reviews with confidence = high for automated scoring.
  • rating + volume → The simplest way to gauge satisfaction. A high rating with a large review volume indicates strong operational presence.
  • summary → Useful for UI displays or sentiment-based scoring.
  • phone_number, address, business_website → Allow you to cross-check consistency between review platforms, the applicant’s inputs, and Baselayer’s other data (e.g., business search or website analysis).

Best practices

Set minimum volume and confidence thresholds: very low volume (e.g., <10 reviews) might not be representative, especially for large or established companies. Review profiles with confidence = medium may be unrelated to the business being analyzed, so treat them with caution.

Leverage sentiment context: use summary to enrich internal case review tools. It provides a human-readable assessment of customer perception.

Cross-verify business identity: match address, phone_number, and business_website across review platforms and Baselayer’s other products to detect impersonation or franchise mismatches.

Apply sector awareness: review visibility varies by industry. For example, consumer-facing companies (retail, hospitality, healthcare) will typically have higher review volumes than B2B firms.
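One way to apply the volume and confidence thresholds above is to filter reviews[] first and only then aggregate. The volume-weighted average below is an illustrative choice, not a Baselayer metric, and it assumes every source reports rating on the same 5-point scale; the function names and the default of 10 reviews are placeholders.

```python
def usable_reviews(reviews, min_volume=10):
    """Keep high-confidence review sources that have a rating and enough volume."""
    return [
        r for r in reviews
        if r.get("confidence") == "high"
        and r.get("rating") is not None
        and (r.get("volume") or 0) >= min_volume
    ]


def weighted_rating(reviews, min_volume=10):
    """Volume-weighted average rating across usable sources, or None if there are none."""
    usable = usable_reviews(reviews, min_volume)
    total = sum(r["volume"] for r in usable)
    if total == 0:
        return None
    return sum(r["rating"] * r["volume"] for r in usable) / total
```

Weighting by volume keeps a source with a handful of reviews from outweighing one with hundreds. Returning None when nothing qualifies lets downstream rules distinguish "no usable data" from "bad rating".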




Putting It All Together

Web Presence & Orderables turn Baselayer into a digital risk lens you can layer onto KYB. Here’s a tight, end-to-end way to use them without overthinking.

  1. Pick the entry point:
    • Onboarding / decisioning: add orderables to POST /searches via options (e.g., "Order.WebsiteAnalysis", "Order.NaicsPrediction", "Order.EnhancedSearch").
      Expect orderables[] to include one object per option (Website Analysis may return two if the submitted site differs from the discovered site).
    • Pre-screening / monitoring / enrichment: use POST /web_presence_requests with the same payload shape (name, address, and optional fields).
  2. Send the right inputs
    • Required: name, address (state minimum; more detail = cleaner matches).
    • Optional: alternative_names or website are particularly helpful.
    • Use Order.EnhancedSearch in options when additional discovery matters.
    • Use reference_id to tie responses and webhooks to your internal case.
  3. Read Website Analysis first (is the site real and operating?)
    • Flag if: website_build_status ≠ active, parked = true, ssl_validity.is_valid ≠ true, email_deliverable = false, or domain_age_months < 6 and the business claims maturity.
    • Review if contacts on site do not match the application (e.g., Gmail on app vs. deliverable corporate domain on site).
  4. Classify with Industry Prediction (and align to policy)
    • Set a confidence floor (e.g., accuracy ≥ 0.75 for auto-use; otherwise review).
    • Maintain prohibited lists at 2-digit / 4-digit / 6-digit NAICS granularity.
    • Scan keywords[] for sensitive terms to catch risky niches inside permitted industries.
    • Respect card network indicators (mastercard_risk, visa_risk_tier) in payments use cases.
  5. Add texture with Social Profiles (if relevant for the industry)
    • Require confidence = high for the profile to influence decisions.
    • Cross-check business_website or email across platforms and your application to spot impersonation.
    • Review misalignments between follower_count and maturity or size of the business.
  6. Check Found Reviews for real-world sentiment (if relevant for the industry)
    • Trust only confidence = high for automation.
    • Use rating + volume together (e.g., low rating with meaningful volume → review).
    • Reconcile address / phone_number / business_website with your app data.
  7. Enrich your internal profiles automatically, so your analysts have this information at hand
  8. Automate decisions only if relevant for your product and customer base
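Steps 1 and 2 above amount to assembling a request body. The sketch below only builds that body and deliberately stops short of sending it, because the base URL, authentication, and exact schema are not covered in this guide. The field names and option strings are the ones mentioned above; the function name and the plain-string address are assumptions, so check Online Presence: Basics for the real payload shape.

```python
import json

# Option strings as listed in this guide.
DEFAULT_OPTIONS = (
    "Order.WebsiteAnalysis",
    "Order.NaicsPrediction",
    "Order.EnhancedSearch",
)


def build_search_payload(name, address, website=None, alternative_names=None,
                         reference_id=None, options=DEFAULT_OPTIONS):
    """Assemble a body for POST /searches or POST /web_presence_requests."""
    if not name or not address:
        raise ValueError("name and address are required")
    payload = {"name": name, "address": address, "options": list(options)}
    # Optional fields are only included when provided.
    if website:
        payload["website"] = website
    if alternative_names:
        payload["alternative_names"] = list(alternative_names)
    if reference_id:
        payload["reference_id"] = reference_id
    return payload


if __name__ == "__main__":
    body = build_search_payload(
        "Acme LLC",
        "123 Main St, San Francisco, CA 94105",  # address shape is an assumption
        website="acme.com",
        reference_id="case-123",
    )
    print(json.dumps(body, indent=2))
```

Passing your own case identifier as reference_id makes it easier to tie the response and any webhooks back to the internal record.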

In short:
Start with the entry point (Business Search vs. Web Presence), run the right options, and make decisions with a short set of clear, testable rules. Website authenticity + industry confidence do most of the heavy lifting; social and reviews add fast, human-readable context.




Example Workflow (balanced rules to kick off evaluation)

Web Presence checks are optional - not every platform or customer needs to run them.

But for teams looking to incorporate digital signals into their onboarding, this workflow provides a balanced starting point that mirrors how Baselayer’s customers evaluate a business’s online footprint.

Why run this workflow

  • Discover more: Identify the real operating website, and confirm it’s active and legitimate.
  • Detect inconsistencies: Catch mismatches between what applicants submit and what exists online.
  • Enhance confidence: Use social and review signals to strengthen decisions for digital-first or early-stage businesses.

Step 1: Website legitimacy

| Check | Logic | Action |
| --- | --- | --- |
| Website operational status | website_build_status ≠ "active" or parked = true | Flag - possibly inactive or placeholder site |
| SSL validity | ssl_validity.is_valid ≠ true | Flag - may indicate spoofed or insecure site |
| Submitted email domain mismatch | If Baselayer finds a website and email_deliverable = true, but the email in the application uses a different domain (e.g., Gmail, Outlook or unrelated domain) | Flag - potential synthetic identity or impersonation risk |
| Website age | If domain_age_months < 6 and months_in_business (from the business search response) > 24 | Flag - possible impersonation or newly created site for an older company |
| Unmatched website | If business_website_match = false | Flag - unverified submitted website |
| Submitted address | If business_address_match ≠ NO_MATCH | Positive signal of identity consistency |
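Most of the Step 1 checks translate directly into code. The sketch below covers the status, SSL, website age, and unmatched website rows and leaves the email and address rows out for brevity. It assumes the Website Analysis result and the Business Search response are available as dicts keyed by the field names used in this guide, with is_valid nested under ssl_validity; the flag names are made up for illustration.

```python
def website_flags(analysis, search):
    """Return the Step 1 flags raised by a website analysis and its business search."""
    flags = []

    # Website operational status: anything other than an active, unparked site.
    if analysis.get("website_build_status") != "active" or analysis.get("parked") is True:
        flags.append("inactive_or_placeholder_site")

    # SSL validity: a missing or null value is flagged too, matching the rule "≠ true".
    if (analysis.get("ssl_validity") or {}).get("is_valid") is not True:
        flags.append("invalid_ssl")

    # Website age: young domain for a business that claims more than two years.
    domain_age = analysis.get("domain_age_months")
    tenure = search.get("months_in_business")
    if domain_age is not None and tenure is not None and domain_age < 6 and tenure > 24:
        flags.append("young_domain_for_established_business")

    # Unmatched website: only an explicit false counts; null means not evaluated.
    if search.get("business_website_match") is False:
        flags.append("unmatched_website")

    return flags
```

Returning a list of named flags, rather than a single boolean, keeps each rule individually testable and gives analysts the reason behind a flag.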

Step 2: Industry alignment

| Check | Logic | Action |
| --- | --- | --- |
| Prediction confidence | If accuracy < 0.75 | Review - low confidence prediction |
| Prohibited industries | If naics_code matches restricted list | Flag - prohibited sector |
| Keyword review | If keywords[] includes words in the restricted list | Flag - sensitive or noncompliant operations |
| Card network indicators | If mastercard_risk = true or visa_risk_tier = 1 | Flag - network-restricted MCC |

Step 3: Social profile verification

| Check | Logic | Action |
| --- | --- | --- |
| Confidence threshold | Only consider profiles where confidence = high | Ignore others for automation |
| Verification or business account | If is_verified = true or is_business_account = true | Positive signal of legitimacy |
| Cross-platform consistency | Compare business_website and email across platforms and input data | Flag - discrepancies between platform data and application info |

Step 4: Review sentiment (if relevant for your sector)

| Check | Logic | Action |
| --- | --- | --- |
| Confidence threshold | Only consider profiles where confidence = high | Ignore others for automation |
| Rating | If rating < 3.0 and volume > 20 | Flag - consistent negative sentiment |
| Address or phone mismatch | If address, business_website or phone_number in reviews differs from submitted info | Flag - possible impersonation or multiple franchise entities |

Step 5: Decisioning summary

Approve when:

  • Website is active and secure
  • Industry prediction confidence ≥ 0.75 and risk level acceptable
  • No flagged content, mismatched emails, or high-risk keywords

Flag when:

  • Domain inactive, insecure, or unrelated to declared business
  • Baselayer identifies signs of potential impersonation or synthetic identity risk
  • Keywords or NAICS codes fall into prohibited lists
  • Baselayer discovers clear inconsistencies (e.g., different operating address, unrelated website, or contact details)
  • Review data shows mixed or unverified sentiment
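The approve and flag conditions above could be folded into a single function once the earlier steps have produced their flags. This is a sketch only: the function name, the reason strings, and the set of acceptable risk levels are hypothetical, and "flag" here means "route to manual review", not "decline".

```python
# Hypothetical policy: which Baselayer risk_level values you accept automatically.
ACCEPTABLE_RISK_LEVELS = {"low", "medium"}


def decide(flags, accuracy, risk_level, accuracy_floor=0.75):
    """Return ('approve', []) or ('flag', reasons) from the collected signals."""
    reasons = list(flags)  # flags raised by the website, social, and review steps
    if accuracy is None or accuracy < accuracy_floor:
        reasons.append("low_industry_confidence")
    if risk_level not in ACCEPTABLE_RISK_LEVELS:
        reasons.append("risk_level_not_acceptable")
    if reasons:
        return ("flag", reasons)
    return ("approve", [])
```

For example, decide([], 0.9, "low") returns ("approve", []), while decide(["invalid_ssl"], 0.9, "low") returns ("flag", ["invalid_ssl"]). Keeping the reasons alongside the outcome makes it easier to tune individual thresholds later.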

Summary

This workflow is a starting template, not a rulebook. Users should tune thresholds and logic based on product type, risk appetite, and geography.

For most teams, the email domain mismatch check is the single highest-value flag to catch impersonation, followed by domain age for detecting synthetic or fraudulent entities.