Online Presence: Best Practices

Decisioning patterns and example workflows for using Online Presence data in your onboarding process.

Introduction

Online presence signals are most valuable when applied with context. A missing website means something different for a 20-year-old concrete contractor than for a company claiming to be an established e-commerce platform. A low review volume is expected for a B2B firm and suspicious for a restaurant.

This guide covers how to use each sub-product effectively, how to combine them into a coherent decisioning workflow, and how to detect fraud patterns observed in the field. It also covers sector-specific patterns. For field definitions and response shapes, see Online Presence: Basics.

Two working examples run throughout this guide. Both are real Web Presence responses, shown side by side:

  • Lucali: a well-known Brooklyn pizza restaurant. Clean, established, legitimate. The submitted website (lucali.com) and the submitted email are both on the corporate domain, and both are consistent with the discovered identity.
  • 🚩 Hartwell Legal Group: a law firm application where the submitted website is a lookalike domain registered 15 days before the application, impersonating an established firm with a 4-year-old domain. Every fraud signal in this guide fires on this example.

Website Analysis

What to look for

Is the site real and operational? Use website_build_status and parked together. A status of coming_soon or inactive, or parked: true, indicates a placeholder or undeveloped domain. Treat these as early-stage signals, not automatic flags, but weight them heavily if the business claims years of operation.

Note that many legitimate SMBs (local contractors, restaurants, service businesses) operate with no website at all, relying instead on Google Business, Yelp, or Facebook. A missing website should be informational, not a red flag. Evaluate it alongside the rest of the digital footprint: strong review presence and consistent contact data across platforms can more than compensate.

Is the domain legitimate? Flag if ssl_validity.is_valid is false. An invalid or missing SSL certificate is a meaningful risk signal, especially for businesses claiming to transact online.

Does the contact information align? Baselayer discovers emails and phone numbers from the website. Cross-check website_analysis.emails[] and phone_numbers[] against the submitted application data. Leverage email_match and phone_number_match fields. A business with email_deliverable: true and a professional domain on the discovered website, but a free email or an email on a different domain in the application, is a synthetic identity or impersonation signal worth flagging.

Does the submitted website match the discovered one? Check business_website_match. A value of false means the domain you were given doesn't match what Baselayer found. Compare input_website_analysis (submitted) and website_analysis (discovered) directly: divergence in parked, website_build_status, or whois_record.domain_age_months between the two is itself a risk signal.
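The comparison between submitted and discovered domains can be sketched as a small helper. This is an illustration only: the function name, the one-month age tolerance, and the assumption that both objects sit at the top level of the response are hypothetical, not part of the Baselayer API.

```python
def divergent_fields(response, age_tolerance_months=1):
    """List fields that differ between the discovered and the submitted website."""
    found = response.get("website_analysis") or {}
    submitted = response.get("input_website_analysis") or {}
    if not found or not submitted:
        return []  # nothing to compare unless both objects are present
    differing = []
    for field in ("parked", "website_build_status"):
        if found.get(field) != submitted.get(field):
            differing.append(field)
    found_age = (found.get("whois_record") or {}).get("domain_age_months")
    submitted_age = (submitted.get("whois_record") or {}).get("domain_age_months")
    if found_age is not None and submitted_age is not None:
        # Hypothetical tolerance: ignore differences of a month or less
        if abs(found_age - submitted_age) > age_tolerance_months:
            differing.append("whois_record.domain_age_months")
    return differing
```

With the Hartwell Legal Group values used in this guide (found domain 50 months old, submitted domain 0 months old, both active and not parked), only the domain age field would be reported.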

How substantial is the website? website_structure_metrics returns depth (maximum link depth: "0", "1", or "2+") and breadth (total unique pages discovered). A real operating business almost always has some structure. A site with depth: "0" and breadth: "1" is not automatically suspicious, but combined with other weak signals it suggests a site put up quickly rather than built over time.

Key thresholds

| Signal | Threshold | Action |
| --- | --- | --- |
| website_build_status | ≠ active or parked: true | Flag: site may be placeholder or inactive |
| ssl_validity.is_valid | false | Flag: insecure or spoofed site |
| whois_record.domain_age_months | < 6 and business claims maturity | Flag: newly registered domain for an older business |
| business_website_match | false | Flag: submitted domain not verified |
| website_structure_metrics | depth: "0" and breadth: "1" | Weak signal: evaluate alongside other indicators |
| Submitted email domain vs. discovered email_deliverable domain | Mismatch (e.g., free email or unrelated domain vs. deliverable corporate domain) | Flag: potential impersonation or synthetic identity |

| Field | ✅ Lucali | 🚩 Hartwell Legal Group |
| --- | --- | --- |
| website_build_status | active (found); active (submitted) | active (found); active (submitted) |
| parked | false (both) | false (both) |
| ssl_validity.is_valid | true (both) | true (submitted); null (found) |
| domain_age_months (found) | 213 | 50, consistent with firm founded 4 years ago |
| domain_age_months (submitted) | 213, matches found domain | 0, domain registered 15 days before application |
| business_website_match | true: submitted lucali.com matches found domain | false: submitted domain does not match found domain |
| email_deliverable (found) | true | true |
| Submitted email domain | lucali.com, matches found domain | hartwelllegalgrp.com, different from found hartwelllegal.com |
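
The key thresholds above can be sketched as one function. This is a minimal illustration, not Baselayer's implementation: the function name, the returned labels, and the assumption that business_website_match sits at the top level of the response next to website_analysis are hypothetical. Confirm the actual response shape in Online Presence: Basics.

```python
def website_flags(response, claims_maturity=False):
    """Return (severity, reason) tuples for the discovered website."""
    analysis = response.get("website_analysis") or {}
    flags = []
    status = analysis.get("website_build_status")
    if analysis.get("parked") is True or (status is not None and status != "active"):
        flags.append(("FLAG", "site may be placeholder or inactive"))
    # Only an explicit false is flagged; null means no SSL data was returned
    if (analysis.get("ssl_validity") or {}).get("is_valid") is False:
        flags.append(("FLAG", "insecure or spoofed site"))
    age = (analysis.get("whois_record") or {}).get("domain_age_months")
    if age is not None and age < 6 and claims_maturity:
        flags.append(("FLAG", "newly registered domain for an older business"))
    if response.get("business_website_match") is False:
        flags.append(("FLAG", "submitted domain not verified"))
    metrics = analysis.get("website_structure_metrics") or {}
    # breadth is compared as text in case it is returned as a number
    if metrics.get("depth") == "0" and str(metrics.get("breadth")) == "1":
        flags.append(("WEAK", "minimal site structure"))
    return flags
```

Note that this only inspects the discovered website. For Hartwell Legal Group it would report the failed business_website_match and nothing else, which is why the impersonation checks in the next section matter.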

Domain Impersonation Patterns

This section covers fraud patterns that are only detectable by comparing the submitted domain against the independently discovered domain. They require input_website_analysis to be present, which means the applicant's website must be included in the request. Submit the website collected in your application form, or the email domain from the applicant.

Domain age vs. business age

The most reliable impersonation signal available. Legitimate businesses typically have domains as old as, or older than, the business itself. A business with years of history applying with a brand-new domain is a serious red flag.

The pattern in practice: A fraudster registers a lookalike domain days or weeks before submitting an application. The name is close enough to pass a casual read: a transposed letter, an added word, a slightly different TLD. They set up email on that domain to establish a communication channel with the FI, intercepting correspondence and redirecting funds while the real business owner never finds out.

Baselayer's website_analysis will return the legitimate, long-standing domain. input_website_analysis exposes the fraud by revealing that the submitted domain was registered days ago. Use whois_record.domain_created_at from both objects alongside months_in_business from the Business Search response.

| Condition | Action |
| --- | --- |
| Submitted domain < 90 days old AND business > 24 months old | Flag for manual review |
| Submitted domain < 30 days old AND business > 24 months old | High-risk: escalate immediately |
| Found domain age roughly consistent with business age | Positive corroborating signal |

from datetime import datetime, timezone


def check_domain_age_risk(web_presence_response, months_in_business):
    # Age of the submitted domain vs. the claimed age of the business
    input_analysis = web_presence_response.get("input_website_analysis")
    if not input_analysis or not months_in_business:
        return "INSUFFICIENT_DATA"
    whois = input_analysis.get("whois_record")
    if not whois or not whois.get("domain_created_at"):
        return "NO_WHOIS_DATA"  # Treat as elevated risk for established businesses
    # fromisoformat() rejects a trailing "Z" before Python 3.11, so normalize it
    created_at = datetime.fromisoformat(whois["domain_created_at"].replace("Z", "+00:00"))
    if created_at.tzinfo is None:
        created_at = created_at.replace(tzinfo=timezone.utc)  # Assume UTC when no offset is given
    domain_age_months = (datetime.now(timezone.utc) - created_at).days / 30
    if domain_age_months < 1 and months_in_business > 24:
        return "HIGH_RISK"    # Under 30 days: escalate immediately
    elif domain_age_months < 3 and months_in_business > 24:
        return "FLAG"         # Under 90 days: manual review required
    elif domain_age_months < 6 and months_in_business > 24:
        return "FLAG"         # Under 6 months: still new for an established business
    else:
        return "PASS"

| Field | ✅ Lucali | 🚩 Hartwell Legal Group |
| --- | --- | --- |
| input_website_analysis | Present: submitted lucali.com | Present: submitted hartwelllegalgrp.com |
| Submitted domain domain_created_at | 2008-08-22 (213 months ago) | 2026-05-09 (15 days before application) |
| Submitted domain domain_age_months | 213, consistent with found domain | 0 |
| Found domain domain_age_months | 213, matches submitted domain | 50, consistent with a firm founded ~4 years ago |
| check_domain_age_risk result | PASS: submitted and found domains align | HIGH_RISK |

The Hartwell Legal Group response makes the fraud pattern unmistakable: a firm claiming years of operation, a found website with a 50-month-old domain, and a submitted website registered 15 days before the application on a domain that closely mirrors the real one (hartwelllegalgrp.com vs. hartwelllegal.com). Lucali, by contrast, shows the expected clean pattern — submitted and found domains are the same, ages align perfectly.

Lookalike domain names

Beyond domain age, inspect the submitted and found domain names directly for visual similarity. Common patterns:

  • Added or transposed characters: hartwelllegalgrp.com vs. hartwelllegal.com
  • Hyphenation: hartwell-legal.com vs. hartwelllegal.com
  • TLD substitution: .net, .co, .org variants of a .com domain
  • Word additions: hartwelllegalgroup.com, hartwelllegalservices.com

When business_website_match: false and the two domain names are visually similar, treat this as a near-certain impersonation attempt regardless of other signals.
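
One way to automate the visual comparison is sketched below with Python's standard difflib module. The function name, the returned labels, and the 0.8 similarity threshold are assumptions for illustration and should be tuned on your own data. The sketch also assumes bare registrable domains (no subdomains other than www).

```python
from difflib import SequenceMatcher


def _split(domain):
    """Lowercase, drop a leading 'www.', and split into (name, tld)."""
    domain = domain.strip().lower().rstrip(".")
    if domain.startswith("www."):
        domain = domain[4:]
    name, _, tld = domain.partition(".")
    return name, tld


def lookalike_reason(submitted, found, threshold=0.8):
    """Return why two different domains look alike, or None if they do not."""
    s_name, s_tld = _split(submitted)
    f_name, f_tld = _split(found)
    if (s_name, s_tld) == (f_name, f_tld):
        return None  # same domain, nothing to flag
    s_flat, f_flat = s_name.replace("-", ""), f_name.replace("-", "")
    if not s_flat or not f_flat:
        return None  # missing or malformed input
    if s_flat == f_flat:
        return "tld_substitution" if s_tld != f_tld else "hyphenation"
    if s_flat.startswith(f_flat) or f_flat.startswith(s_flat):
        return "added_text"  # extra characters or words appended
    if SequenceMatcher(None, s_flat, f_flat).ratio() >= threshold:
        return "similar_characters"  # e.g. transposed letters
    return None


print(lookalike_reason("hartwelllegalgrp.com", "hartwelllegal.com"))  # added_text
print(lookalike_reason("hartwell-legal.com", "hartwelllegal.com"))    # hyphenation
print(lookalike_reason("hartwelllegal.net", "hartwelllegal.com"))     # tld_substitution
```

Only run this when business_website_match is false; a non-None result then corresponds to the near-certain impersonation case described above.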

Free email vs. deliverable corporate domain

When the applicant submits an email address on a domain that differs from the discovered business domain — and the discovered domain has email_deliverable: true — that combination is a strong impersonation signal. If an established business has a working corporate domain with functioning email, there is no legitimate reason for a representative to apply using an email address on a different domain.

The most likely explanation: the real business exists, but the applicant does not work there. Note that this signal applies beyond free email providers. In the Hartwell Legal Group example, the submitted email is on the newly-registered lookalike domain (hartwelllegalgrp.com), not a free provider — but the mismatch against the discovered hartwelllegal.com domain is equally telling.

from urllib.parse import urlparse

FREE_EMAIL_DOMAINS = {
    "gmail.com", "yahoo.com", "icloud.com", "hotmail.com",
    "outlook.com", "aol.com", "protonmail.com", "me.com"
}


def check_email_impersonation(submitted_email, found_website_url, website_analysis):
    if not submitted_email or not website_analysis:
        return "INSUFFICIENT_DATA"
    if not website_analysis.get("email_deliverable", False):
        return "PASS"  # Found domain not deliverable, so the signal is not applicable
    if "@" not in submitted_email:
        return "INSUFFICIENT_DATA"
    submitted_domain = submitted_email.rsplit("@", 1)[-1].strip().lower()
    if not submitted_domain:
        return "INSUFFICIENT_DATA"
    if submitted_domain in FREE_EMAIL_DOMAINS:
        return "FLAG"   # Free email while corporate domain is operational
    # Extract the found domain. urlparse only fills the host when a scheme or "//" is present.
    found_domain = None
    if found_website_url:
        url = found_website_url if "//" in found_website_url else "//" + found_website_url
        found_domain = (urlparse(url).hostname or "").lower()
        if found_domain.startswith("www."):
            found_domain = found_domain[4:]  # str.lstrip("www.") would strip characters, not the prefix
    if found_domain and submitted_domain != found_domain:
        return "FLAG"   # Email on a different domain than the discovered business domain
    return "PASS"

| Field | ✅ Lucali | 🚩 Hartwell Legal Group |
| --- | --- | --- |
| Submitted email | On lucali.com (corporate domain) | On hartwelllegalgrp.com (lookalike domain) |
| Found domain | lucali.com | hartwelllegal.com |
| email_deliverable (found) | true | true |
| Email domain matches found domain | Yes: lucali.com = lucali.com | No: submitted email is on the lookalike domain |
| check_email_impersonation result | PASS | FLAG |

Industry Prediction

What to look for

Set a confidence floor. accuracy ≥ 0.75 is the recommended threshold for automated decisioning. Predictions below this are typically based on thin or conflicting online data. Use them directionally (at the 2-digit or 4-digit NAICS level) but require manual review before acting on the 6-digit code.

Maintain prohibited industry lists at multiple levels. Use NAICS hierarchy strategically:

  • 2-digit codes to ban entire sectors (e.g., 71 = Arts, Entertainment, and Recreation)
  • 4-digit codes to target industry groups (e.g., 7132 = Gambling Industries)
  • 6-digit codes for precision targeting (e.g., 713210 = Casinos while allowing 713290 = Other Gambling Industries)

Prohibited and restricted NAICS and keyword lists are available from your Baselayer account representative.
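
Because NAICS codes are hierarchical, a multi-level prohibited list can be checked with a simple prefix match. The sketch below is illustrative: the function name is hypothetical and the example lists reuse only the codes mentioned above, not an actual restricted list.

```python
def match_prohibited_naics(code, prohibited_prefixes):
    """Return the longest prohibited prefix that the code falls under, or None."""
    code = str(code).strip()
    matches = [p for p in prohibited_prefixes if code.startswith(p)]
    return max(matches, key=len) if matches else None


# Banning only casinos leaves other gambling industries permitted
print(match_prohibited_naics("713210", {"713210"}))  # 713210
print(match_prohibited_naics("713290", {"713210"}))  # None
# Banning the 4-digit industry group catches both
print(match_prohibited_naics("713290", {"7132"}))    # 7132
```

Returning the matched prefix rather than a boolean keeps a record of which list entry triggered the decision.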

Scan keywords[] for sensitive terms. Even within permitted industries, certain keywords may indicate restricted activities — opioid risk in healthcare (pain, opioid), regulated substances (CBD, cannabis, vape), or potential TOSA risk (escort, adult). Maintaining a keyword watchlist catches high-risk niches inside approved sectors. A complete recommended keyword list is available from your account representative.
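
A keyword watchlist scan might look like the sketch below. The watchlist here contains only the example terms from the paragraph above and is not the recommended list. Matching on whole words is a design choice of this sketch, so that "pain" does not fire on "painting"; multi-word watchlist terms would need a different approach.

```python
import re

# Example terms only; the full recommended list comes from your account representative
EXAMPLE_WATCHLIST = {"pain", "opioid", "cbd", "cannabis", "vape", "escort", "adult"}


def flagged_keywords(keywords, watchlist=EXAMPLE_WATCHLIST):
    """Return the predicted keywords that contain a watchlist term as a whole word."""
    hits = []
    for keyword in keywords or []:
        words = set(re.findall(r"[a-z0-9]+", keyword.lower()))
        if words & watchlist:
            hits.append(keyword)
    return hits


print(flagged_keywords(["painting contractor", "pain management", "CBD oil"]))
# ['pain management', 'CBD oil']
```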

Use mcc_codes[] for card network compliance. For payment processing use cases, check mastercard_risk and visa_risk_tier inside each entry of mcc_codes[]. If any MCC has mastercard_risk: true or visa_risk_tier: "1", apply the relevant network compliance controls. Note that a single industry prediction may return multiple MCC codes — check all of them.
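
Checking every returned MCC can be sketched as follows. The function name is hypothetical, and the assumption that mcc_codes[] is a list of objects nested inside the industry prediction should be confirmed against the response shape.

```python
def network_restricted_mccs(industry_prediction):
    """Return every mcc_codes[] entry that carries a card network risk indicator."""
    restricted = []
    for entry in industry_prediction.get("mcc_codes") or []:
        mastercard = entry.get("mastercard_risk") is True
        # Compared as text in case the tier is returned as a number
        visa = str(entry.get("visa_risk_tier")) == "1"
        if mastercard or visa:
            restricted.append(entry)
    return restricted
```

An empty list means no entry triggered; any non-empty result should route the application to the relevant network compliance controls.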

Note: Baselayer uses 2017 NAICS codes across all industry prediction and classification endpoints. The 2017 revision is the basis for all returned codes, filters, and industry-related fields. If you're cross-referencing against another system, confirm it also uses the 2017 standard.

Tiered review policy

| Accuracy | Treatment |
| --- | --- |
| ≥ 0.75 | Safe for automated decisioning. Apply prohibited industry checks and network risk indicators automatically. |
| 0.50 – 0.74 | Use directionally. Flag for analyst review. Apply prohibited checks at 2-digit or 4-digit NAICS level only. |
| < 0.50 | Do not use for automated decisions. Request additional information about business activities. |

| Field | ✅ Lucali | 🚩 Hartwell Legal Group |
| --- | --- | --- |
| accuracy | 0.95 | 0.98 |
| code | 722511 (Full-Service Restaurants) | 541110 (Offices of Lawyers) |
| keywords | restaurant, pizzeria, pizza, calzone, dine-in, table service | law firm, attorney, legal representation, civil litigation, family law |
| risk_level | low | low |
| mastercard_risk | false | false |

Note: industry prediction is clean on both.
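
The tiered review policy can be expressed as a small routing function. The treatment labels are hypothetical; the second element of the result is the deepest NAICS level (in digits) at which this sketch would apply prohibited-list checks.

```python
def accuracy_tier(accuracy):
    """Map a prediction accuracy to (treatment, maximum NAICS digits to act on)."""
    if accuracy is None:
        return ("REQUEST_INFO", None)  # no prediction to work with
    if accuracy >= 0.75:
        return ("AUTOMATE", 6)
    if accuracy >= 0.50:
        return ("ANALYST_REVIEW", 4)   # directional use only
    return ("REQUEST_INFO", None)


print(accuracy_tier(0.95))  # ('AUTOMATE', 6)
print(accuracy_tier(0.62))  # ('ANALYST_REVIEW', 4)
```

Values between 0.74 and 0.75 fall into the analyst review tier here, since anything below 0.75 misses the automation floor.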


Social Profiles

Social profiles are returned at business.social_profiles[] on POST /searches (requires Order.Enhanced) and at found_social_profiles[] on POST /web_presence_requests (requires Order.SocialMedia). The object structure is identical.

What to look for

Filter by confidence first. Only profiles with confidence: high should influence automated decisions. Medium and low confidence profiles may belong to unrelated businesses with similar names.

Verify submitted social profiles. When you submit known social profiles in the request, Baselayer returns a social_profiles_match[] array. Treat this as an additional confidence signal alongside email_match and phone_number_match: a confirmed match is a positive identity signal that corroborates the applicant's submitted data. A submitted profile that was not confirmed is informational rather than diagnostic, but warrants closer review. social_profiles_match[] is available on POST /web_presence_requests.

Cross-reference contact data across platforms. Compare business_website and email fields across discovered profiles and the submitted application. Inconsistencies between platform data and application data — especially different websites or unrelated email domains — can indicate impersonation.

Look for business-oriented signals. is_business_account: true on Instagram and is_business_page: true on Facebook are strong legitimacy indicators. is_private: true on Instagram is low-confidence and should not contribute to approval decisions.

Weigh by sector. High follower counts and review presence matter more for consumer-facing brands than B2B firms. For B2B, LinkedIn is typically the most relevant platform; for retail and hospitality, Instagram and Facebook carry more weight. Adjust thresholds accordingly.

Check found_on[]. Profiles with found_on: ["FOUND_WEBSITE"] were linked directly to the business's website, a stronger signal of ownership than profiles discovered through general search alone.

| Field | ✅ Lucali | 🚩 Hartwell Legal Group |
| --- | --- | --- |
| Profiles found | Instagram (279K followers), Facebook (13K followers) | Facebook (7 followers, found_on: ["FOUND_WEBSITE"]) |
| is_business_page | true (both) | true |
| business_website in profile | lucali.com | hartwelllegal.com, not the submitted domain |
| found_on | ["FOUND_WEBSITE"] | ["FOUND_WEBSITE"] |

The Hartwell example shows an important corroborating signal: the Facebook page linked from the found website (hartwelllegal.com) points back to hartwelllegal.com, not to the submitted hartwelllegalgrp.com. The real business presents a consistent identity across platforms, and that identity directly contradicts the submitted application.
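
The social profile checks above can be combined into one pass over the profile list. This is a sketch under assumptions: the function names and result labels are hypothetical, and confidence is compared case-insensitively because the exact casing of the value is not confirmed here.

```python
from urllib.parse import urlparse


def _domain(url):
    """Return the bare host of a URL or domain string, without 'www.'."""
    if not url:
        return None
    host = (urlparse(url if "//" in url else "//" + url).hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]
    return host or None


def social_signals(profiles, submitted_website):
    """Collect positive and negative signals from high-confidence profiles only."""
    submitted = _domain(submitted_website)
    result = {"positive": [], "flag": []}
    for profile in profiles or []:
        if str(profile.get("confidence", "")).lower() != "high":
            continue  # medium and low confidence profiles are ignored
        if profile.get("is_business_account") is True or profile.get("is_business_page") is True:
            result["positive"].append("business account or page")
        if "FOUND_WEBSITE" in (profile.get("found_on") or []):
            result["positive"].append("linked from the business website")
        site = _domain(profile.get("business_website"))
        if site and submitted and site != submitted:
            result["flag"].append(f"profile points to {site}, not {submitted}")
    return result
```

For Hartwell Legal Group, the Facebook page would produce two positive signals about the real business and one flag, because its business_website is hartwelllegal.com while the application submitted hartwelllegalgrp.com.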


Reviews

Reviews are returned at business.reviews[] on POST /searches (requires Order.Enhanced) and at found_reviews[] on POST /web_presence_requests (requires Order.ReviewSummary or Order.ReviewFull). The object structure is identical.

What to look for

Check Google open_state for operational status. The metadata.open_state field is populated only for Google reviews (source: "google") and reflects the current status of the business as reported by Google Maps. This is one of the most direct operational signals available. Critical values:

  • "Permanently closed": the business has been marked closed on Google. A major red flag for any application; escalate regardless of other signals.
  • "Temporarily closed": the business is not currently operating. Worth clarifying with the applicant.
  • "Closed · Opens [time]": outside operating hours, but active. Normal.
  • "Open": currently open. Positive signal.

def check_google_open_state(found_reviews):
    # Only Google review entries populate metadata.open_state
    for review in found_reviews or []:
        if review.get("source") == "google":
            metadata = review.get("metadata") or {}
            open_state = (metadata.get("open_state") or "").lower()
            if "permanently closed" in open_state:
                return "HIGH_RISK"
            elif "temporarily closed" in open_state:
                return "FLAG"
    return "PASS"

Combine rating and volume. A high rating with low volume (< 10) is weak evidence. A low rating with meaningful volume (> 20) is a strong negative signal. Use both together.
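
The rating and volume rule can be sketched as below. The labels are hypothetical, and the 3.0 cutoff for a low rating is taken from the example workflow later in this guide.

```python
def review_signal(rating, volume):
    """Combine rating and review volume into a single label."""
    if rating is None or volume is None:
        return "NO_DATA"
    if volume < 10:
        return "WEAK_EVIDENCE"  # too few reviews to trust the rating either way
    if rating < 3.0 and volume > 20:
        return "FLAG"           # consistent negative feedback
    return "NO_FLAG"


print(review_signal(4.2, 3100))  # NO_FLAG (Lucali)
print(review_signal(4.5, 26))    # NO_FLAG (Hartwell Legal Group)
```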

Only trust confidence: high for automation. Medium and low confidence review profiles may belong to an unrelated business. Filter before scoring.

Cross-verify identity. Check address, phone_number, and business_website across review platforms and your submitted application data. In impersonation cases, review data will consistently point to the real business — not the submitted application details.

Apply sector awareness. Consumer-facing businesses (retail, hospitality, healthcare, food service) typically have substantial review presence. B2B firms often have few or no reviews — absence is not a risk signal for these sectors.

| Field | ✅ Lucali | 🚩 Hartwell Legal Group |
| --- | --- | --- |
| Google open_state | "Closed · Opens 5 PM" (dinner-only restaurant, normal) | "Closed · Opens 9 AM Mon" (professional office, normal) |
| Google review volume | 3,100 | 26 (low but appropriate for a small law firm) |
| Google rating | 4.2 | 4.5 |
| business_website in review | lucali.com | hartwelllegal.com (the real domain, not the submitted one) |
| phone_number in review | Matches application | Does not match submitted phone number |

In the Hartwell example, the review data independently confirms the real business identity: phone number and website in the Google listing point to the legitimate firm, directly contradicting the submitted application.
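
The identity cross-check can be sketched as a comparison of normalized websites and phone numbers. The function names are hypothetical, and comparing only the last 10 digits of a phone number is an assumption that suits US-style numbers. The same helper works for directory listings, since they carry the same two fields.

```python
import re
from urllib.parse import urlparse


def _domain(url):
    """Return the bare host of a URL or domain string, without 'www.'."""
    if not url:
        return None
    host = (urlparse(url if "//" in url else "//" + url).hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]
    return host or None


def _phone(number):
    """Keep digits only and compare on the last 10 (assumption: US-style numbers)."""
    digits = re.sub(r"\D", "", number or "")
    return digits[-10:] or None


def identity_mismatches(records, submitted_website=None, submitted_phone=None):
    """Return (source, field, found_value) for each value that contradicts the application."""
    checks = (
        ("business_website", _domain, submitted_website),
        ("phone_number", _phone, submitted_phone),
    )
    mismatches = []
    for record in records or []:
        for field, normalize, submitted in checks:
            found, given = normalize(record.get(field)), normalize(submitted)
            if found and given and found != given:
                mismatches.append((record.get("source"), field, found))
    return mismatches
```

Filter the records to confidence: high before calling this, so that an unrelated business with a similar name does not create a false mismatch.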


Directory Listings

Directory listings require Order.DirectoryListing to be included in your request options.

What to look for

Use people[] for officer cross-referencing. Directory listings often include officers or principals. Cross-reference these against submitted officer_names and business.business_officers[] from the KYB search. An officer appearing consistently across directories, the website, and SoS records is a strong identity signal.

Cross-reference contact details against the application. As with reviews, directory listings independently record the business's contact information. In impersonation cases, directory data will point to the real business — phone numbers, addresses, and websites that differ from what was submitted.

Use listing count directionally. There is no hard threshold, but a business appearing in multiple independent directories signals an established real-world presence. Zero directory listings for a claimed 5+ year operation is worth noting, particularly for local service businesses, contractors, and professional services.

Weight directory presence by sector. Local service businesses, contractors, and professional services firms typically appear in directories. Digital-native or B2B businesses may have fewer listings — absence is not a red flag for these sectors.

| Field | ✅ Lucali | 🚩 Hartwell Legal Group |
| --- | --- | --- |
| Listings found | n/a | 1 (legal directory listing) |
| category | n/a | "Law Firm" (consistent with 541110) |
| business_website in listing | n/a | hartwelllegal.com (the real domain) |
| phone_number in listing | n/a | Does not match submitted phone number |

Again, the directory independently confirms the real business identity — and contradicts the submitted application.
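
Officer cross-referencing needs some tolerance for name formatting. The sketch below is one possible approach, not a prescribed matching rule: it assumes each people[] entry is an object with a name key (a hypothetical shape), ignores initials and common suffixes, and treats two names as matching when they share at least two name parts and one is contained in the other.

```python
import re

_SUFFIXES = {"jr", "sr", "ii", "iii", "esq"}


def _name_parts(name):
    """Lowercased name parts, without initials or common suffixes."""
    words = re.findall(r"[a-z]+", (name or "").lower())
    return {w for w in words if len(w) > 1 and w not in _SUFFIXES}


def names_match(a, b):
    """True when the names share two or more parts and one contains the other."""
    parts_a, parts_b = _name_parts(a), _name_parts(b)
    shared = parts_a & parts_b
    return len(shared) >= 2 and (parts_a <= parts_b or parts_b <= parts_a)


def matched_officers(submitted_officer_names, people):
    """Return submitted officer names that also appear in a directory's people[]."""
    listed = [p.get("name") for p in people or []]
    return [
        name for name in submitted_officer_names or []
        if any(names_match(name, other) for other in listed)
    ]
```

This handles "Last, First" ordering and middle initials, but it will not catch nicknames or misspellings, so treat a missing match as informational.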


Putting It All Together

Web presence signals work best as a layered system. Each sub-product adds a different dimension:

  • Website Analysis answers: is this business real and operational online?
  • Industry Prediction answers: is this business in a permitted sector, and how confident are we?
  • Social Profiles answers: does this business have a consistent public identity across platforms?
  • Reviews answers: do real customers interact with this business, and is their experience consistent with the application?
  • Directory Listings answers: is this business recognized by third-party sources, and does the data they hold align?

**No single signal should drive a decision alone.**

The Hartwell Legal Group example illustrates this precisely: industry prediction is clean (0.98 accuracy, low risk, no prohibited keywords), the found website is active and legitimate, and the business has positive reviews. The fraud only becomes visible when you compare the submitted domain against the discovered one, and then cross-reference contact data across reviews and directories to see the contradiction.

Lucali, by contrast, passes every check cleanly: submitted domain matches the found domain, domain age is consistent with business age, and the submitted email is on the same corporate domain.

The most reliable approach:

  1. Start with website legitimacy. website_build_status, parked, ssl_validity.is_valid, and domain age are fast checks that catch the most obvious cases.
  2. Run impersonation checks if a website was submitted. Domain age vs. business age and email domain mismatch catch the fraud patterns that website legitimacy checks alone miss. Leverage fields like email_match or phone_number_match to cross-reference the submitted information. These are the highest-value checks in this guide.
  3. Classify with industry prediction. A confidence score and prohibited-list check against industry_prediction.code and keywords[] does the heavy lifting for compliance screening.
  4. Add social and review context where relevant. Sector-appropriate, confidence-filtered, cross-referenced against application data.
  5. Use directory listings to corroborate. Particularly valuable for local businesses, contractors, and sole proprietors. In impersonation cases, directory data will independently confirm the real business identity.

Example Workflow

This workflow is a starting template. Tune thresholds and logic based on your product type, risk appetite, and customer base.

Step 1: Website legitimacy

| Check | Logic | Signal |
| --- | --- | --- |
| Website operational | website_build_status ≠ active or parked: true | ℹ️ Informational: evaluate alongside review and social presence |
| No website found | found_website is null | ℹ️ Informational: common for SMBs; weight other signals more heavily |
| SSL validity | ssl_validity.is_valid ≠ true | 🚩 Flag: may indicate spoofed or insecure site |
| Domain age vs. business age (submitted) | domain_age_months < 1 and months_in_business > 24 | 🚨 High-risk: escalate immediately |
| Domain age vs. business age (submitted) | domain_age_months < 6 and months_in_business > 24 | 🚩 Flag: new domain for an established business |
| Submitted website unverified | business_website_match: false | 🚩 Flag: submitted domain not confirmed |
| Email domain mismatch | Found domain has email_deliverable: true; submitted email is on a free provider or different domain | 🚩 Flag: potential impersonation |
| Address confirmed online | business_address_match_sources includes FOUND_WEBSITE, REVIEW, or DIRECTORY | ✅ Positive identity signal |

Step 2: Industry alignment

| Check | Logic | Signal |
| --- | --- | --- |
| Low prediction confidence | accuracy < 0.75 | 🔍 Review: low confidence prediction |
| Prohibited industry | industry_prediction.code matches restricted NAICS list | 🚩 Flag: prohibited sector |
| Sensitive keywords | keywords[] includes terms from restricted keyword list | 🚩 Flag: potentially noncompliant operations |
| Card network risk | Any mcc_codes[].mastercard_risk: true or mcc_codes[].visa_risk_tier: "1" | 🚩 Flag: network-restricted MCC |

Step 3: Social profile verification (if relevant for your sector)

| Check | Logic | Signal |
| --- | --- | --- |
| Confidence threshold | Profile confidence ≠ high | ⚪ Ignore for automation |
| Business account | is_verified: true or is_business_account: true | ✅ Positive legitimacy signal |
| Submitted profile verified | social_profiles_match[] entry returns matched: true for a submitted profile | ✅ Positive identity signal |
| Cross-platform consistency | business_website in profile differs from submitted domain | 🚩 Flag: profiles point to a different domain than submitted |

Step 4: Reviews (if relevant for your sector)

| Check | Logic | Signal |
| --- | --- | --- |
| Google operational status | metadata.open_state contains "permanently closed" | 🚨 High-risk: escalate immediately |
| Google operational status | metadata.open_state contains "temporarily closed" | 🚩 Flag: investigate |
| Negative sentiment | rating < 3.0 and volume > 20 | 🚩 Flag: consistent negative feedback |
| Identity mismatch | business_website or phone_number in reviews differs from application data | 🚩 Flag: review data points to a different business identity |

Step 5: Directory listings (if relevant for your sector)

| Check | Logic | Signal |
| --- | --- | --- |
| Officer cross-reference | people[] entries match submitted officer_names | ✅ Positive identity signal |
| Contact data mismatch | business_website or phone_number in listing differs from application data | 🚩 Flag: directory points to a different business identity |

Decisioning summary

Approve when:

  • Website is active, secure, and consistent with application data
  • Submitted domain age is consistent with business age (or no website submitted)
  • Industry prediction accuracy ≥ 0.75, predicted code is permitted, no flagged keywords
  • No identity inconsistencies across sub-products

Review when:

  • No website found and no compensating signals (reviews, social profiles, or directory listings) exist
  • Website signals are mixed (e.g., active but recently registered)
  • Social or review data is absent in a sector where it would be expected

Flag when:

  • Submitted domain recently registered for a business claiming years of operation
  • Email domain mismatch suggesting impersonation or synthetic identity
  • business_website_match: false combined with domain age discrepancy or visual domain similarity
  • Review or directory data points to a different website or phone number than submitted
  • Industry code or keywords match a prohibited list
  • Any MCC has mastercard_risk: true or visa_risk_tier: "1"
  • Google open_state indicates "Permanently closed"
  • Consistent identity discrepancies across website, social profiles, reviews, and directories
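
The summary above can be reduced to a "worst signal wins" rule, sketched below. The severity ordering, the decision labels, and the choice to send unrecognized labels (such as INSUFFICIENT_DATA) to review are assumptions of this sketch. Tune them to your risk appetite.

```python
SEVERITY = {"PASS": 0, "INFO": 0, "REVIEW": 1, "FLAG": 2, "HIGH_RISK": 3}
DECISION = {0: "APPROVE", 1: "REVIEW", 2: "FLAG", 3: "ESCALATE"}


def summarize(results):
    """Turn {check name: label} into (decision, names of checks that need attention)."""
    # Unknown labels are treated as REVIEW rather than silently passing
    levels = {name: SEVERITY.get(label, 1) for name, label in results.items()}
    worst = max(levels.values(), default=0)
    reasons = [name for name, level in levels.items() if level >= 1]
    return DECISION[worst], reasons


print(summarize({"domain_age": "HIGH_RISK", "email": "FLAG", "industry": "PASS"}))
# ('ESCALATE', ['domain_age', 'email'])
```

Keeping the list of contributing checks alongside the decision gives analysts the context they need when an application is routed to them.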

Sector-specific guidance

| Sector | What matters most |
| --- | --- |
| Consumer retail / e-commerce | Active website with SSL; domain age; impersonation checks; social presence (Instagram, Facebook); Google and Yelp reviews with meaningful volume |
| Restaurants / hospitality | Review volume and rating (Google, Yelp, TripAdvisor); Google open_state; Facebook check-ins; address consistency across platforms |
| B2B / professional services | LinkedIn company page (confidence: high); website legitimacy; domain impersonation checks; directory listings; officer cross-reference |
| Contractors / local services | Directory presence (industry associations, chamber of commerce, licensing registries); website analysis; address match sources |
| Healthcare | Website legitimacy; keyword scan for restricted terms (full list available from your account representative); license-based directory listings |
