
What's in the SEC Auto ABS Loan-Level Schema

The SEC adopted rules in 2014 requiring auto ABS issuers to publish loan-level data. In the current ABS XML Technical Specification for auto loans, ABS v3.1 dated September 18, 2023, the auto-loan XSD defines 72 asset-level data elements. If you are searching for exact ABS-EE variable names like obligorCreditScore, paymentToIncomePercentage, vehicleManufacturerName, reportingPeriodActualEndBalanceAmount, currentDelinquencyStatus, or chargedoffPrincipalAmount, those are the names that appear in the SEC schema and sample XML.

The data exists. Getting it into a form you can actually use is another matter entirely.

That work — across the issuer shelves we track — took years to get right. The result is 45.8 million unique loans, 1.14 billion monthly performance rows, and roughly 74 billion individual data points, cleaned, resolved, and ready to query. LoanTape keeps the SEC variable names visible and also maps them into query-friendly aliases where that makes analysis easier, but this post uses the SEC names first.


What Regulation AB-II actually requires

The SEC's Regulation AB-II, finalized in 2014, was designed to fix a specific problem: investors in ABS couldn't see what they owned. They got pool-level summaries (weighted-average FICO, aggregate delinquency rate, monthly loss) but no visibility into individual loans or the composition of the collateral they were exposed to.

The regulation addressed this by pairing two filings for every monthly reporting period:

  • Form 10-D: covers the distribution period at the pool level (cash flows, credit enhancement triggers, collateral performance)
  • ABS-EE: the loan-level exhibit, filed alongside the 10-D, with one row per loan per reporting period

The ABS-EE is where the data lives. Every issuer on a public ABS shelf is required to file it monthly in XML format, to a standardized schema the SEC publishes. In the current SEC auto-loan XSD, that means 72 asset-level elements spanning origination data like obligorCreditScore, vehicle fields like vehicleManufacturerName, month-end performance fields like currentDelinquencyStatus and reportingPeriodActualEndBalanceAmount, cash fields like totalActualAmountPaid, and lifecycle fields like zeroBalanceCode.

The SEC's intent was transparency. What it produced, from a data engineering standpoint, was a distributed archive: hundreds of XML files per month, one per trust, across dozens of issuers, going back over a decade.


What "just downloading it from EDGAR" actually looks like

This is where most institutional data projects quietly stall.

Each issuer has multiple trusts. Toyota alone has issued over 30 auto trust series since 2016. Santander has issued more. Every trust files its own ABS-EE XML every month, independently. To get a complete picture of one issuer's portfolio, you're looking at dozens of XML files per month. For 18 issuers over 9+ years, that's thousands of files just to get started.

The XML isn't trivial to parse. The Reg AB-II schema is well-defined, but real-world filings aren't always clean. Fields are sometimes missing, sometimes null where they shouldn't be, and the spec allows issuers to amend prior filings, which means the same loan can appear with different values across different accession numbers.
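
As a rough illustration of the parsing step, here is a minimal Python sketch that reads asset records out of an ABS-EE-style document using only the standard library. The sample XML is made up: its namespace URI and the assetData/assets wrapper elements are assumptions for the example, so check them against a real filing. Only the field names (assetNumber, obligorCreditScore, currentDelinquencyStatus) come from the schema discussed in this post. Missing or empty elements come back as None, so the gaps described above stay visible instead of silently turning into empty strings.

```python
import xml.etree.ElementTree as ET

# Made-up sample for illustration. The namespace URI and the assetData/assets
# wrapper elements are assumptions, not copied from a real filing.
SAMPLE_XML = """<assetData xmlns="urn:example:absee-autoloan">
  <assets>
    <assetNumber>0001</assetNumber>
    <obligorCreditScore>712</obligorCreditScore>
    <currentDelinquencyStatus>0</currentDelinquencyStatus>
  </assets>
  <assets>
    <assetNumber>0002</assetNumber>
    <obligorCreditScore></obligorCreditScore>
  </assets>
</assetData>"""

FIELDS = ["assetNumber", "obligorCreditScore", "currentDelinquencyStatus"]


def local_name(tag):
    """Strip the '{namespace}' prefix ElementTree puts on qualified tags."""
    return tag.rsplit("}", 1)[-1]


def parse_assets(xml_text, fields=FIELDS):
    """Return one dict per asset record; absent or empty elements become None."""
    root = ET.fromstring(xml_text)
    rows = []
    for asset in root:
        if local_name(asset.tag) != "assets":
            continue
        values = {local_name(child.tag): (child.text or "").strip() for child in asset}
        rows.append({name: values.get(name) or None for name in fields})
    return rows


rows = parse_assets(SAMPLE_XML)
```

Matching on the local tag name rather than a hard-coded namespace is a deliberate choice for a sketch like this, since it keeps the code working if the namespace differs from what you expected.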

Then there's the cross-issuer normalization problem. Even within the same schema, field semantics drift. One issuer reports vehicleValueSourceCode consistently; another leaves it blank for half their trusts. The currentDelinquencyStatus buckets are standardized, but how issuers apply them varies slightly at the edges. None of this is obvious until you're deep in the data.

And then there's infrastructure: you need somewhere to store a billion rows of monthly data, tooling to keep it current as new filings drop, and a process for detecting when a prior-period amendment changes historical values.

We built all of that. It took years and it's still running.


The issuers

We currently cover the following issuer families and shelves under Regulation AB-II:

| Issuer | Shelf | Segment |
|---|---|---|
| Ally | AMCAR | Prime |
| BMW | BBART | Prime |
| Bridgecrest | DRIVE | Near-prime / Subprime |
| Capital One | COPAR | Prime |
| CarMax | CARMX | Prime |
| Carvana | CZABS | Near-prime |
| Drive (Westlake) | DRIVE | Subprime |
| Exeter | EART | Subprime |
| Ford Motor Credit | FORDO | Prime |
| Fifth Third | FTABS | Prime |
| GM Financial | AMCAR/GMCAR | Prime |
| Harley-Davidson | HDMOT | Specialty |
| Honda | HAROT | Prime |
| Hyundai | HAOT | Prime |
| Mercedes-Benz | MBALT | Prime |
| Nissan | NAROT | Prime |
| Santander | SDART | Subprime |
| Toyota | TAOT | Prime |
| Volkswagen | VWALT | Prime |
| World Omni | WOART | Prime |

Between the captive finance arms (Toyota, Ford, Honda, Hyundai, BMW, Mercedes, Nissan, VW, Harley), the bank issuers (Ally, Capital One, Fifth Third, World Omni), and the non-prime lenders (Santander, Exeter, Drive, Bridgecrest, Carvana), you get a full cross-section of the market. Two additional issuers, California Republic and USAA, are in our config but disabled after their shelves wound down.


The field inventory

The current SEC auto-loan XSD is better understood as one asset-level schema than as a clean static-versus-monthly split. Some elements are origination-oriented, some are reporting-period measures, and some capture servicing, repurchase, and repo lifecycle events. Below, I use the exact SEC variable names because that is what issuers file and what searchers usually type into Google.

Origination-oriented SEC variables

On the borrower side, the variables most people search for are obligorCreditScore, obligorCreditScoreType, obligorEmploymentVerificationCode, obligorIncomeVerificationLevelCode, coObligorIndicator, paymentToIncomePercentage, and obligorGeographicLocation.

For the vehicle, the official SEC names are vehicleManufacturerName, vehicleModelName, vehicleModelYear, vehicleTypeCode, vehicleNewUsedCode, vehicleValueAmount, and vehicleValueSourceCode.

Loan terms and setup fields include originalLoanAmount, originalLoanTerm, originationDate, loanMaturityDate, originalInterestRatePercentage, originalInterestRateTypeCode, interestCalculationTypeCode, originalFirstPaymentDate, paymentTypeCode, underwritingIndicator, and subvented.

Some data buyers search for loanToValueRatio, originalLoanToValueRatio, leaseIndicator, or residualValueAmount. Those are not official element names in the current SEC auto-loan XSD. In LoanTape, LTV is a normalized or derived field built from raw SEC fields such as originalLoanAmount and vehicleValueAmount, while lease residual fields belong to the auto-lease schema, not the auto-loan schema.
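
To make the derived-field idea concrete, here is a small sketch of an LTV calculation from the two raw SEC inputs named above. The formula is the plain ratio of originalLoanAmount to vehicleValueAmount expressed as a percentage. That is an assumption for illustration, not a description of LoanTape's actual derivation, and the function name is hypothetical.

```python
from decimal import Decimal, InvalidOperation


def derive_ltv_pct(original_loan_amount, vehicle_value_amount):
    """Return originalLoanAmount / vehicleValueAmount as a percentage, or None.

    Illustrative formula only; None is returned when either input is missing,
    unparseable, or when the vehicle value is not a positive number.
    """
    try:
        loan = Decimal(str(original_loan_amount))
        value = Decimal(str(vehicle_value_amount))
    except (InvalidOperation, TypeError, ValueError):
        return None
    if not loan.is_finite() or not value.is_finite() or value <= 0:
        return None
    return (loan / value * 100).quantize(Decimal("0.01"))


ltv = derive_ltv_pct("24000.00", "20000.00")  # a loan larger than the vehicle value
missing = derive_ltv_pct("18000.00", "")      # blank vehicleValueAmount
```

Returning None rather than zero for a blank vehicleValueAmount keeps unknown LTVs out of any average or bucket count computed later.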

Reporting-period SEC variables

Performance fields are reported once per loan per reporting period. A loan originated in 2020 with a 60-month term can have up to 60 rows of monthly data, one for every month it was active; a loan that pays off or charges off early has fewer. Across 45.8 million loans, that's where the 1.14 billion rows come from.

Each month starts with balance and delinquency status: reportingPeriodBeginningLoanBalanceAmount, reportingPeriodActualEndBalanceAmount, currentDelinquencyStatus, remainingTermToMaturityNumber, reportingPeriodInterestRatePercentage, and nextInterestRatePercentage.

One of the more useful parts of the spec is actual cash collected, reported separately from scheduled: totalActualAmountPaid, actualPrincipalCollectedAmount, actualInterestCollectedAmount, actualOtherCollectedAmount, scheduledPrincipalAmount, scheduledInterestAmount, reportingPeriodScheduledPaymentAmount, and nextReportingPeriodPaymentAmountDue. The gap between scheduled and actual tells you a lot about a pool before the delinquency buckets even move.
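
One way to put a number on that gap is a simple collection ratio: actual principal and interest collected divided by scheduled principal and interest, summed over a pool for one reporting period. The sketch below assumes rows are already parsed into dicts keyed by SEC field names with numeric values. The choice of which fields to compare is an assumption for illustration, and the sample rows are made up.

```python
def collection_ratio(rows):
    """Actual principal + interest collected divided by scheduled, over a set of rows."""
    scheduled = sum(
        r["scheduledPrincipalAmount"] + r["scheduledInterestAmount"] for r in rows
    )
    actual = sum(
        r["actualPrincipalCollectedAmount"] + r["actualInterestCollectedAmount"]
        for r in rows
    )
    # No scheduled cash means the ratio is undefined rather than zero.
    return actual / scheduled if scheduled else None


# Made-up rows: loan one pays as scheduled, loan two pays nothing.
pool_rows = [
    {"scheduledPrincipalAmount": 300.0, "scheduledInterestAmount": 100.0,
     "actualPrincipalCollectedAmount": 300.0, "actualInterestCollectedAmount": 100.0},
    {"scheduledPrincipalAmount": 250.0, "scheduledInterestAmount": 150.0,
     "actualPrincipalCollectedAmount": 0.0, "actualInterestCollectedAmount": 0.0},
]

ratio = collection_ratio(pool_rows)  # 400 collected against 800 scheduled
```

Prepayments push the ratio above 1, so in practice you would probably look at it alongside payoff activity rather than on its own.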

Loss and workout fields in the current auto-loan XSD are narrower than many people expect: chargedoffPrincipalAmount, recoveredAmount, repossessedIndicator, repossessedProceedsAmount, zeroBalanceCode, and zeroBalanceEffectiveDate carry most of the lifecycle signal. The current auto-loan schema does not define separate elements named chargeoffDate, chargeoffAmount, liquidationAmount, deficiencyBalanceAmount, or bankruptcyIndicator.

Modification tracking centers on reportingPeriodModificationIndicator, modificationTypeCode, and paymentExtendedNumber. The SEC added a Forbearance enumerated value to modificationTypeCode in ABS v3.1, but the current auto-loan schema does not publish standalone forbearanceIndicator, defermentIndicator, extensionIndicator, or skipPaymentIndicator elements.

Servicing and repurchase fields include servicingAdvanceMethodCode, servicingFeePercentage, servicingFlatFeeAmount, otherServicerFeeRetainedByServicer, otherAssessedUncollectedServicerFeeAmount, servicerAdvancedAmount, primaryLoanServicerName, gracePeriodNumber, assetSubjectDemandIndicator, assetSubjectDemandStatusCode, repurchaseAmount, demandResolutionDate, repurchaserName, and repurchaseReplacementReasonCode.

In total, the current SEC auto-loan XSD defines 72 data elements.

The restated-field problem

The Reg AB-II spec permits issuers to amend fields after origination: a restated obligorCreditScore, a corrected vehicleValueAmount, an updated originationDate. Over a trust's life, the same loan can appear with different values across different accession numbers, all for what should be a fixed attribute.

This is not a rare edge case. We see it regularly, particularly for obligorCreditScore, vehicleValueAmount, and originationDate. The raw data has no built-in resolution mechanism. If you're joining on the latest filing, you might be using an amended value from a filing that also amended other fields inconsistently.

For every field with documented resolution logic, we track the resolved value, the resolution mode (earliest non-null, last reported, majority, etc.), the source accession number, and the filing date it came from. Every number in the dataset traces to a specific SEC filing. When someone asks where a number came from, we can answer.
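
A minimal sketch of what resolution modes like these could look like, assuming each observation of a field is a (filing date, accession number, value) tuple. The mode names mirror the ones listed above, but the logic is an illustration under stated assumptions rather than LoanTape's implementation: "last reported" is taken to mean the latest non-null value, majority ties are not handled specially, and the accession numbers in the sample are hypothetical placeholders.

```python
from collections import Counter


def resolve(observations, mode):
    """Resolve one static field from (filing_date, accession_number, value) tuples.

    Returns (value, accession_number, filing_date) for the chosen observation,
    or None if every reported value is null.
    """
    non_null = sorted(
        (o for o in observations if o[2] is not None), key=lambda o: o[:2]
    )
    if not non_null:
        return None
    if mode == "earliest_non_null":
        chosen = non_null[0]
    elif mode == "last_reported":
        chosen = non_null[-1]
    elif mode == "majority":
        top_value, _ = Counter(o[2] for o in non_null).most_common(1)[0]
        # Attribute the majority value to the earliest filing that reported it.
        chosen = next(o for o in non_null if o[2] == top_value)
    else:
        raise ValueError(f"unknown mode: {mode}")
    filing_date, accession, value = chosen
    return value, accession, filing_date


# Hypothetical obligorCreditScore history for one loan across four filings.
history = [
    ("2021-02-15", "acc-0001", None),
    ("2021-03-15", "acc-0002", 700),
    ("2021-04-15", "acc-0003", 700),
    ("2021-05-15", "acc-0004", 680),
]

earliest = resolve(history, "earliest_non_null")
latest = resolve(history, "last_reported")
majority = resolve(history, "majority")
```

Returning the accession number and filing date with the value is what makes the answer traceable: the resolved number always points back at one specific filing.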


The numbers

| What | Count |
|---|---|
| Lifetime unique loans | 45.8 million |
| Issuer / shelf rows listed here | 20 |
| Monthly performance rows | 1.14 billion |
| Total data points | ~74 billion |
| Coverage start | 2016 |
| Max depth per loan | 108 months |
| FICO bands | 7 (sub-560 through 720+, Unscored, Missing) |

Some loans pay off within 24 months, while subprime paper regularly runs to 72 or 84. Every monthly filing cycle adds tens of millions of new rows.
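
A quick back-of-the-envelope check on how those headline figures relate to each other, using only the rounded numbers from the table above:

```python
loans = 45.8e6        # lifetime unique loans
rows = 1.14e9         # monthly performance rows
data_points = 74e9    # total data points (approximate)

rows_per_loan = rows / loans          # average months of history per loan
points_per_row = data_points / rows   # average populated values per monthly row

print(f"{rows_per_loan:.1f} rows per loan, {points_per_row:.1f} data points per row")
```

That works out to roughly 25 monthly rows per loan on average and about 65 data points per row, in line with a schema of 72 elements where not every field is populated on every row.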


What you can do with it

Vintage curves are the starting point for most credit work. originationDate, originalLoanAmount, and monthly chargedoffPrincipalAmount are all there, so you can build cumulative loss curves by issuer, vintage year, FICO band, vehicle segment, or whatever cut you need. We pre-aggregate this so you're not writing the SQL from scratch.
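
If you did want to write it from scratch, the core of a cumulative loss curve is short. The sketch below groups loans by origination year and accumulates chargedoffPrincipalAmount by months on book, divided by the vintage's original balance. It assumes dates are already reduced to "YYYY-MM" strings and amounts are floats. The input shapes and the sample data are assumptions for illustration, and the curve is gross of recoveries.

```python
from collections import defaultdict


def months_between(start_ym, end_ym):
    """Whole months from a 'YYYY-MM' start to a 'YYYY-MM' end."""
    sy, sm = map(int, start_ym.split("-"))
    ey, em = map(int, end_ym.split("-"))
    return (ey - sy) * 12 + (em - sm)


def cumulative_loss_curves(loans, performance):
    """Map vintage year -> {months on book: cumulative charge-offs / original balance}.

    loans: {assetNumber: (originationDate 'YYYY-MM', originalLoanAmount)}
    performance: iterable of (assetNumber, period 'YYYY-MM', chargedoffPrincipalAmount)
    """
    original = defaultdict(float)
    for origination_ym, amount in loans.values():
        original[origination_ym[:4]] += amount

    losses = defaultdict(lambda: defaultdict(float))
    for asset_number, period_ym, chargedoff in performance:
        origination_ym, _ = loans[asset_number]
        age = months_between(origination_ym, period_ym)
        losses[origination_ym[:4]][age] += chargedoff

    curves = {}
    for vintage, by_age in losses.items():
        running, curve = 0.0, {}
        for age in sorted(by_age):
            running += by_age[age]
            curve[age] = running / original[vintage]
        curves[vintage] = curve
    return curves


# Made-up example: two 2020-vintage loans, both charging off at 12 months on book.
loans = {"A": ("2020-01", 20000.0), "B": ("2020-06", 30000.0)}
performance = [
    ("A", "2020-07", 0.0),
    ("A", "2021-01", 5000.0),
    ("B", "2021-06", 2500.0),
]
curves = cumulative_loss_curves(loans, performance)
```

Swapping the vintage key for issuer, FICO band, or vehicle segment only changes the grouping key; the accumulation logic stays the same.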

Roll rates and cure rates come from joining consecutive months on currentDelinquencyStatus. You get the full Markov transition matrix: what percentage of current loans stay current, how many 30-day loans cure, how many roll to 60. The below-prime version of that analysis is published on the dataset page.
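
The join itself can be sketched in a few lines. The example below assumes each row is (assetNumber, period index, status), where the period index is a consecutive integer month counter. The status labels "current", "30", and "60" are placeholders for readability, not actual currentDelinquencyStatus codes.

```python
from collections import Counter, defaultdict


def transition_matrix(rows):
    """Build {from_status: {to_status: share}} from consecutive monthly rows.

    rows: iterable of (assetNumber, period_index, status), with period_index
    an integer that increases by one each month.
    """
    status_at = {(asset, period): status for asset, period, status in rows}
    counts = defaultdict(Counter)
    for (asset, period), status in status_at.items():
        next_status = status_at.get((asset, period + 1))
        if next_status is not None:  # the loan's last observed month has no transition
            counts[status][next_status] += 1
    return {
        src: {dst: n / sum(dsts.values()) for dst, n in dsts.items()}
        for src, dsts in counts.items()
    }


# Made-up history: loan A goes delinquent and cures, loan B rolls from 30 to 60.
rows = [
    ("A", 1, "current"), ("A", 2, "current"), ("A", 3, "30"), ("A", 4, "current"),
    ("B", 1, "30"), ("B", 2, "60"),
]
matrix = transition_matrix(rows)
```

In this toy data half of the 30-day observations cure and half roll to 60, which is the kind of split the matrix is meant to surface.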

The modification fields are where a lot gets missed. 2020 made this obvious: loans tagged through reportingPeriodModificationIndicator and modificationTypeCode behaved differently from unmodified loans in the same delinquency bucket. Pool-level data gives you one blended number per bucket. Loan-level data lets you split that bucket into modified and unmodified loans.

Collateral cuts use vehicleManufacturerName, vehicleModelYear, and vehicleValueAmount, plus LoanTape-derived LTV fields built from official SEC inputs. Used-vehicle loans above 120% LTV have a different loss profile than new-vehicle loans below 90%. You can verify it and segment on it.

For cash flow verification: actual principal and interest collected at the loan level are reported each month. Sum by pool, reconcile against the 10-D distribution figures. When they don't match, the loan-level data tells you where to look.
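
A minimal sketch of that reconciliation, assuming the pool-level figures from the 10-D have already been keyed by hand to the matching SEC field names (the 10-D itself is not structured this way, so that mapping is an assumption). The function name, tolerance, and sample figures are all hypothetical.

```python
def reconcile(loan_rows, reported_totals, tolerance=0.01):
    """Compare loan-level sums with pool-level figures.

    Returns {field: loan-level sum minus reported total} for every field
    whose difference exceeds the tolerance.
    """
    breaks = {}
    for field, reported in reported_totals.items():
        summed = sum(row.get(field) or 0.0 for row in loan_rows)
        if abs(summed - reported) > tolerance:
            breaks[field] = round(summed - reported, 2)
    return breaks


# Made-up loan-level rows for one trust and one reporting period.
loan_rows = [
    {"actualPrincipalCollectedAmount": 300.0, "actualInterestCollectedAmount": 100.0},
    {"actualPrincipalCollectedAmount": 250.0, "actualInterestCollectedAmount": 75.5},
]
# Made-up pool-level totals, keyed to the same field names for comparison.
reported_totals = {
    "actualPrincipalCollectedAmount": 550.0,
    "actualInterestCollectedAmount": 180.0,
}

breaks = reconcile(loan_rows, reported_totals)
```

Here principal ties out and interest is short by 4.50, so the next step would be to look at which loans report interest that differs from what you expected.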


Common SEC variable names

The full SEC auto-loan XSD has 72 elements. These are the ones people most often search for, with the exact SEC names first and LoanTape aliases or notes in the last column.

Borrower, vehicle, and origination variables

| SEC variable | What it means | LoanTape alias / note |
|---|---|---|
| assetNumber | Unique loan identifier inside the trust | asset_number |
| originatorName | Named originator for the asset | originator_name |
| originationDate | Loan origination month | origination_date |
| originalLoanAmount | Original principal balance | original_loan_amount |
| originalLoanTerm | Original term in months | original_loan_term |
| loanMaturityDate | Scheduled maturity month | maturity_date |
| originalInterestRatePercentage | Coupon at origination | original_interest_rate_percentage |
| originalInterestRateTypeCode | Fixed or variable rate code | original_interest_rate_type_code |
| vehicleManufacturerName | Vehicle make or manufacturer | vehicle_make |
| vehicleModelName | Vehicle model | vehicle_model |
| vehicleModelYear | Model year | vehicle_model_year |
| vehicleNewUsedCode | New-versus-used code | vehicle_new_used_code |
| vehicleValueAmount | Vehicle value at origination | vehicle_value_amount |
| vehicleValueSourceCode | Vehicle valuation source | vehicle_value_source_code |
| obligorCreditScore | Borrower credit score at origination | obligor_credit_score |
| obligorCreditScoreType | Credit score model name | obligor_credit_score_type |
| obligorIncomeVerificationLevelCode | Income verification code | obligor_income_verification_level |
| obligorEmploymentVerificationCode | Employment verification code | obligor_employment_verification_code |
| coObligorIndicator | Co-borrower indicator | co_obligor_indicator |
| paymentToIncomePercentage | Payment-to-income ratio | payment_to_income_percentage |
| obligorGeographicLocation | Borrower geography | geographic_location |
| subvented | Manufacturer or program subsidy code | subvented |

Reporting-period balance, cash, and credit-performance variables

| SEC variable | What it means | LoanTape alias / note |
|---|---|---|
| reportingPeriodBeginningDate | Start date for the reporting period | stored as raw SEC field when needed |
| reportingPeriodEndingDate | End date for the reporting period | stored as raw SEC field when needed |
| reportingPeriodBeginningLoanBalanceAmount | Beginning balance for the period | reporting_period_beginning_loan_balance_amount |
| reportingPeriodActualEndBalanceAmount | Ending balance for the period | current_loan_balance_amount |
| currentDelinquencyStatus | Current delinquency bucket or days past due code | current_delinquency_status |
| remainingTermToMaturityNumber | Remaining term in months | remaining_term_to_maturity_number |
| reportingPeriodInterestRatePercentage | Current period interest rate | current_interest_rate_percentage |
| nextInterestRatePercentage | Next scheduled interest rate | stored as raw SEC field when needed |
| reportingPeriodScheduledPaymentAmount | Scheduled payment amount for the current period | part of LoanTape scheduled-payment normalization |
| nextReportingPeriodPaymentAmountDue | Payment due next period | part of LoanTape scheduled-payment normalization |
| totalActualAmountPaid | Total actual cash received | total_actual_paid_amount |
| actualPrincipalCollectedAmount | Actual principal collected | actual_principal_collected_amount |
| actualInterestCollectedAmount | Actual interest collected | actual_interest_collected_amount |
| actualOtherCollectedAmount | Other collected cash | actual_other_collected_amount |
| scheduledPrincipalAmount | Scheduled principal due | scheduled_principal_amount |
| scheduledInterestAmount | Scheduled interest due | scheduled_interest_amount |
| otherPrincipalAdjustmentAmount | Principal adjustments outside scheduled cash | other_principal_adjustment_amount |
| reportingPeriodModificationIndicator | Whether the loan was modified in the period | modification_indicator |
| modificationTypeCode | Modification type code, including Forbearance as an enum value | modification_type_code |
| paymentExtendedNumber | Number of payments extended | payment_extended_number |
| chargedoffPrincipalAmount | Principal charged off in the period | chargedoff_principal_amount |
| recoveredAmount | Recoveries received in the period | recovered_amount |

Lifecycle, servicing, repo, and repurchase variables

| SEC variable | What it means | LoanTape alias / note |
|---|---|---|
| zeroBalanceCode | Why the loan reached zero balance | zero_balance_code |
| zeroBalanceEffectiveDate | Effective month of zero balance | zero_balance_effective_date |
| primaryLoanServicerName | Current servicer name | primary_loan_servicer_name |
| servicingAdvanceMethodCode | Servicer advance method | servicer_advancing_method |
| servicingFeePercentage | Servicing fee rate | servicing_fee_percentage |
| servicingFlatFeeAmount | Flat servicing fee amount | stored as raw SEC field when needed |
| otherServicerFeeRetainedByServicer | Other fee retained by servicer | stored as raw SEC field when needed |
| otherAssessedUncollectedServicerFeeAmount | Assessed but uncollected servicer fee | other_assessed_uncollected_servicer_fee_amount |
| servicerAdvancedAmount | Amount advanced by the servicer | stored as raw SEC field when needed |
| gracePeriodNumber | Grace period count | grace_period_number |
| interestPaidThroughDate | Date interest is paid through | stored as raw SEC field when needed |
| mostRecentServicingTransferReceivedDate | Most recent servicing transfer date | stored as raw SEC field when needed |
| assetSubjectDemandIndicator | Whether the asset is subject to a demand | stored as raw SEC field when needed |
| assetSubjectDemandStatusCode | Demand status code | stored as raw SEC field when needed |
| repurchaseAmount | Repurchase amount | repurchase_amount |
| demandResolutionDate | Resolution date for the demand | stored as raw SEC field when needed |
| repurchaserName | Name of the repurchaser | stored as raw SEC field when needed |
| repurchaseReplacementReasonCode | Repurchase or replacement reason code | stored as raw SEC field when needed |
| repossessedIndicator | Whether the collateral was repossessed | repossession_indicator |
| repossessedProceedsAmount | Repo proceeds amount | repossessed_proceeds_amount |

Common search terms that are not official SEC auto-loan variables

These terms are common in Google searches and data conversations, but they are not element names in the current SEC auto-loan XSD: loanToValueRatio, originalLoanToValueRatio, chargeoffDate, chargeoffAmount, cumulativeRecoveriesAmount, forbearanceIndicator, defermentIndicator, extensionIndicator, skipPaymentIndicator, leaseIndicator, residualValueAmount, balloonIndicator, interestOnlyIndicator, prepaymentPenaltyIndicator, and nextPaymentDueDate.


Access

The ABS-EE subscription includes the current SEC auto-loan variables, plus LoanTape's normalized aliases and derived fields built on top of them. That means you can search by official names like obligorCreditScore and reportingPeriodActualEndBalanceAmount, then still work with cleaner analysis fields once you are inside the product. The monthly performance series and related trust-level cash flow products are available through the ABS remittance data product. Both are queryable via API or Databricks-compatible exports.

If you've looked at building this yourself and decided the engineering cost isn't worth it, that's probably the right call. The raw filing infrastructure exists, the XML is public, and the schema is documented. But the pipeline to turn it into something you can actually query takes years to build and needs to keep running every month. That's what you're paying for.

Pricing is at /pricing. The dataset page has the full field catalog with coverage rates by issuer.