In 2022, the WCO and its Members identified “Technology and Innovation” as one of the three focus areas of the WCO Strategic Plan 2022-2025. For the past few years, China Customs has been implementing the concept of “Smart Customs, Smart Borders and Smart Connectivity”, which established a roadmap towards digital and smart solutions for Customs control, governance and cooperation. Implementing the concept led China Customs to look into ways to integrate big data and artificial intelligence (AI) into its data analytics tools.
Before going further, it is useful to define some key terms: data, information, insights, analytics, big data and AI. Customs administrations seek to make confident decisions based on insights gained from information, which is data that has been processed, organized, structured or presented in a meaningful context. Analytics is the creation of insights from data using systematic computational analysis. Big data refers to data sets that contain data of a greater variety (structured and unstructured data, text, audio, video, sensor measurements, social media, and more), of high veracity, arriving in increasing volumes and at greater velocity. As for the term AI, it is used to describe systems which are trained to use information to perform tasks such as identifying patterns or generating new content.
In 2018, a Big Data Management Division was established within China Customs’ Risk Management Department to coordinate Customs data management, establish a unified data architecture, and formulate and implement plans, systems and schemes. Teams were also set up in the Customs Districts of Tianjin, Shanghai, Huangpu and Jiangmen to incubate AI-related projects. Staff were reassigned from other sections to work together on special projects to develop AI models. Risk identification and targeting tasks carried out by humans were gradually replaced by computerized processes. After five years of research and development, piloting, and promotion, the approach was adopted in 260 seaports and airports nationwide. In 2023, 22,642 declarations were controlled using the new analytical tools, with a detection rate far exceeding that of human analysis.
Constructing a unified database
China Customs has comprehensively collected multi-source data and brought them together in a data lake, stored locally in a centralized manner, with over 15,000 data tables and more than 260 billion data entries.
The data come mainly from five sources:
- over 300 types of internal Customs data, such as Customs declaration data, manifest data, inspection data, anti-smuggling data and enterprise management data;
- data from other governmental agencies, such as tax data, market data, invoice data, foreign exchange data and passenger data;
- data from other Customs administrations;
- commercial data purchased from data service companies;
- publicly available data from the Internet.
These data are stored physically under unified data standards, providing the foundation for analysis and AI applications to find hidden clues and identify risks.
Building data sets for each business
The data lake is then used to create multiple data sets according to a “one business, one data set” model. Based on the characteristics of different Customs businesses such as Customs clearance, supervision, risk control, commodity inspection, tariffs, enterprise management and post-clearance audit, relevant data in the data lake are identified and brought together into a table which serves as a data set for experts to conduct data analysis. At present, China Customs has built over 100 data sets, connecting various system silos from the bottom up, allowing data to flow fully between different systems.
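The "one business, one data set" idea can be illustrated with a minimal sketch, assuming a relational store. The example below uses Python's built-in sqlite3 module as a stand-in for the data lake; the table names, column names and sample rows are hypothetical and are not taken from China Customs' systems.

```python
import sqlite3


def build_risk_dataset(conn: sqlite3.Connection) -> None:
    """Join source tables into one wide table for a single business area.

    Hypothetical schema: `declarations` and `inspections` stand in for
    two of the many tables held in the data lake.
    """
    conn.execute(
        """
        CREATE TABLE risk_dataset AS
        SELECT d.declaration_id,
               d.enterprise_id,
               d.hs_code,
               d.declared_value,
               i.result AS inspection_result
        FROM declarations AS d
        LEFT JOIN inspections AS i
               ON i.declaration_id = d.declaration_id
        """
    )


# Illustrative data only: an in-memory database with two tiny source tables.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE declarations (
        declaration_id TEXT PRIMARY KEY,
        enterprise_id  TEXT,
        hs_code        TEXT,
        declared_value REAL
    );
    CREATE TABLE inspections (declaration_id TEXT, result TEXT);
    INSERT INTO declarations VALUES
        ('D1', 'E1', '8471', 1000.0),
        ('D2', 'E2', '3915', 250.0);
    INSERT INTO inspections VALUES ('D1', 'no_irregularity');
    """
)
build_risk_dataset(conn)
rows = conn.execute(
    "SELECT declaration_id, inspection_result "
    "FROM risk_dataset ORDER BY declaration_id"
).fetchall()
```

A left join is used here so that declarations that were never inspected still appear in the data set, with an empty inspection result.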
Developing a unified data analysis platform
A platform has been established to facilitate the use of data by China Customs staff, cultivate a data culture, and enable every officer to become a data analyst. Called “Cloud Engine” in English, it has over 3,000 daily active users. More than 28,000 analyses are executed every day using the analytics models and applications stored on the platform. In 2023, 2,917 fraud cases were identified this way.
Establishing a unified data portal
A data portal has been established to provide a single entry point to all data and all existing data catalogues, and help data users find the data they want.
Building a cluster of intelligent models
While supporting the capacity of officers to conduct data analysis via the “Cloud Engine” platform, China Customs also aims to automate the risk analysis process by exploring the use of machine learning models.
Based on historical data from both internal and external Customs sources, the model uses the open-source CatBoost/XGBoost algorithms to build a program that provides real-time intelligent risk scoring for each Customs declaration, enterprise and commodity. The model is connected to the Customs risk operation system so that risks in Customs declarations are identified in real time during clearance.
The principle of the model construction is as follows:
- a database was created from internal and external Customs data, including Customs declaration forms, manifests, enterprise reports, logistics documents, inspection reports, anti-smuggling filing, audit and inspection results, fund flows and insurance information;
- a library of 105 risk characteristics was established, 76 of which are derived from the risk indicators in the WCO Risk Management Compendium. Some risk characteristics were developed based on experts’ experience while others were calculated using algorithms;
- the AI algorithm CatBoost was used to build the model, which assigns each Customs declaration a predicted risk value. Two thresholds were set: a high-risk threshold, T1, and a low-risk threshold, T2. Declarations with predicted values exceeding T1 are intercepted, and the goods are inspected; declarations with predicted values below T2 are quickly released; declarations with predicted values between T2 and T1 are transferred to experts.
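The two-threshold rule described in the last step can be sketched as follows. This is an illustration only, not China Customs' implementation: the function and enum names are hypothetical, the threshold values in the usage note are invented, and the handling of a score exactly equal to T1 or T2 (sent to experts here) is an assumption, since the article does not specify it.

```python
from enum import Enum


class Action(Enum):
    INSPECT = "inspect"              # score above the high-risk threshold T1
    EXPERT_REVIEW = "expert_review"  # score between T2 and T1
    RELEASE = "release"              # score below the low-risk threshold T2


def triage(score: float, t1: float, t2: float) -> Action:
    """Map a predicted risk score to a control action.

    t1 is the high-risk threshold and t2 the low-risk threshold,
    so t2 must not exceed t1.
    """
    if t2 > t1:
        raise ValueError("low-risk threshold t2 must not exceed t1")
    if score > t1:
        return Action.INSPECT
    if score < t2:
        return Action.RELEASE
    # Assumption: boundary values fall into the expert-review band.
    return Action.EXPERT_REVIEW
```

With illustrative thresholds of T1 = 0.8 and T2 = 0.2, a declaration scored 0.95 would be inspected, one scored 0.05 would be released, and one scored 0.5 would go to an expert.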
After five years of development and testing, the model has been deployed nationwide. It has been found that the model is more precise than random controls and can discover risks that officers in charge of targeting might not find, such as risks associated with new enterprises or commodities. Finally, the model ensures consistency in controls, unlike targeting personnel, whose capabilities and enforcement standards differ across the country.
To manage the full life cycle of the model, China Customs has established the “Customs Big Data Application Model Management Measures”, which standardize the processes of model research and development, testing, promotion, and decommissioning. At the same time, a model performance evaluation index system has been established to regularly assess the effectiveness of the model and implement dynamic adjustments.
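The article does not list the indicators used in the evaluation index system. As a rough illustration of what a regular assessment could compute, the sketch below derives two common classification metrics, precision and recall, from inspection outcomes. The choice of metrics and all names are assumptions, and recall can only be estimated where irregularities in unflagged declarations come to light by other means, such as random checks or post-clearance audit.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple


@dataclass(frozen=True)
class Outcome:
    flagged: bool    # the model selected the declaration for inspection
    irregular: bool  # an irregularity was confirmed for the declaration


def precision_recall(outcomes: Iterable[Outcome]) -> Tuple[float, float]:
    """Return (precision, recall) over a batch of outcomes.

    precision: share of flagged declarations with a confirmed irregularity.
    recall: share of confirmed irregularities that the model had flagged.
    Either value is reported as 0.0 when its denominator is zero.
    """
    outcomes = list(outcomes)
    tp = sum(1 for o in outcomes if o.flagged and o.irregular)
    fp = sum(1 for o in outcomes if o.flagged and not o.irregular)
    fn = sum(1 for o in outcomes if not o.flagged and o.irregular)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall
```

Tracking such figures over time is one way a drop in performance could trigger retraining or a threshold adjustment.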
In addition, explorations and trials have been carried out in key areas and at different stages of Customs clearance operations, resulting in the construction of multiple independently developed models. Some are built to identify solid waste, dangerous goods and items bought online. Others screen Customs declarations and certificates. Together they form a cluster of models, the goal being to build a model wherever there are Customs operations, use AI models to replace manual operations, and ultimately build a modern, digital and smart Customs Administration empowered by data and models.
After years of exploration and practice, China Customs believes that Customs administrations should not only strengthen the application of AI but also deepen cooperation with each other in these areas. With this in mind, in 2023, together with the WCO Secretariat, it launched the Smart Customs Project to facilitate exchanges and share technological applications and innovative solutions among Customs administrations. Besides information and experience sharing, China Customs believes that collaborative efforts in the field of AI, such as the creation of a platform where Customs administrations can cooperate based on bilateral or multilateral agreements, would be a win-win situation.
More information
Gao Fengrong
gaofr@sina.cn