Deliver CLS Logs to Tencent Cloud DLC for Spark-Based Analysis

How to configure CLS log delivery into Data Lake Compute, map fields and partitions, and choose between DLC, Ckafka, and COS delivery paths.

Log platforms often start with search and alerting, then grow into data processing and analytics workflows. Tencent Cloud Log Service (CLS) already supports delivery to Ckafka and COS. The workflow described here covers a third delivery target: Tencent Cloud Data Lake Compute (DLC).

With CLS-to-DLC delivery, logs from a CLS topic can be written directly into DLC so teams can use Spark for processing and analysis.

Choose the Right Delivery Target

CLS delivery paths serve different downstream workloads.

Delivery target | Best fit
DLC | Big data compute and offline analysis. The source material highlights Spark, Spark Streaming, MLlib, and GraphX scenarios.
Ckafka | Real-time stream computing and event pipelines.
COS | Long-term archiving and log audit storage.

DLC is the right target when logs need to participate in Spark-based analytics rather than only operational search or archive retention.

Why DLC Matters for Log Analytics

The source material calls out two advantages of DLC for log processing:

DLC advantage | Practical value
Real-time stream processing | Spark Streaming can be used for real-time analysis over delivered log data.
Spark ecosystem | MLlib supports machine learning workflows, and GraphX supports graph analysis such as analyzing user relationships in a social network.

This makes CLS-to-DLC delivery useful when log data needs to move from operations visibility into data-lake analytics.

Configuration Workflow

Use this sequence in the CLS console.

  1. Log in to the Tencent Cloud CLS console.
  2. Open the log topic that should deliver data to DLC.
  3. In the left-side navigation, select DLC delivery.
  4. Select the target DLC database and table.
  5. Configure data field mapping.
  6. Configure partition field mapping.
  7. Confirm the parameters and submit the task.

After confirmation, CLS creates the delivery task that writes the log topic data to DLC.

Map Data Fields

Field mapping defines how CLS log fields land in the DLC table.

Mapping case | Configuration behavior
CLS and DLC fields share the same name | The fields can be mapped automatically.
CLS and DLC fields use different names | Enter the CLS log field manually and map it to the target DLC field.
Multiple data types are required | Use the DLC table field types supported by the configuration.

The source material points readers to the DLC common data type documentation for exact type selection. In the delivery task itself, the important operational rule is simple: make field names and types explicit before the task is enabled.
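
The mapping rules above can be sketched offline before touching the console. The following is a minimal illustration, not the CLS implementation: the function name `plan_field_mappings`, the `FieldMapping` record, and the example field names are all hypothetical. It pairs same-name fields automatically, applies manual mappings where names differ, and reports DLC columns that still have no source.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FieldMapping:
    cls_field: str   # field name in the CLS log (hypothetical example names)
    dlc_field: str   # column name in the DLC table
    automatic: bool  # True when the names matched without manual input


def plan_field_mappings(cls_fields, dlc_columns, manual=None):
    """Return (mappings, unmapped_dlc_columns).

    `manual` maps a DLC column name to the CLS field that should feed it.
    """
    manual = manual or {}
    available = set(cls_fields)
    mappings, unmapped = [], []
    for column in dlc_columns:
        if column in manual:
            source = manual[column]
            if source not in available:
                raise ValueError(
                    f"manual mapping for {column!r} uses unknown CLS field {source!r}"
                )
            mappings.append(FieldMapping(source, column, automatic=False))
        elif column in available:
            # Same name on both sides: eligible for automatic mapping.
            mappings.append(FieldMapping(column, column, automatic=True))
        else:
            unmapped.append(column)
    return mappings, unmapped


mappings, unmapped = plan_field_mappings(
    cls_fields=["status", "request_uri", "client_ip"],
    dlc_columns=["status", "uri", "user_agent"],
    manual={"uri": "request_uri"},
)
for m in mappings:
    print(m)
print("unmapped DLC columns:", unmapped)
```

In this example `user_agent` is reported as unmapped, which is the kind of gap worth resolving before the task is enabled. Type checks are left out on purpose because they depend on the DLC data type documentation.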

Map Partition Fields

Partition mapping controls how delivered logs are organized in DLC.

Partition strategy | How to configure it
Time-based partitioning | Use the CLS log time field as the partition mapping source.
Additional business partitions | Choose the corresponding log field, such as a service, tenant, region, or application field when available.
No partition mapping | Turn off the partition mapping switch.

Time partitioning is usually the safest default because it matches log data lifecycle and analytical windows. Additional fields should be used only when they are stable enough to support query pruning and downstream table organization.
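
To make the idea of time-based and additional partitions concrete, here is a small sketch of how partition values could be derived from one log record. It only illustrates the grouping and does not describe how CLS performs the mapping internally. The field name `log_time_ms`, the millisecond UTC timestamp, the partition column `dt`, and the daily `YYYY-MM-DD` format are all assumptions made for the example.

```python
from datetime import datetime, timezone


def partition_values(record, time_field="log_time_ms", extra_fields=()):
    """Derive partition column values for one log record.

    Assumes `time_field` holds a Unix timestamp in milliseconds (UTC).
    """
    ts = datetime.fromtimestamp(record[time_field] / 1000, tz=timezone.utc)
    values = {"dt": ts.strftime("%Y-%m-%d")}  # daily time partition
    for name in extra_fields:
        # Additional business partitions are copied from the record as-is.
        values[name] = record[name]
    return values


record = {"log_time_ms": 1700000000000, "region": "ap-guangzhou", "msg": "ok"}
print(partition_values(record, extra_fields=("region",)))
```

A field such as `region` works as an extra partition because it takes few distinct values and rarely changes; a high-cardinality field such as a request ID would create one partition per value and defeat pruning.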

Validation Checklist

Before treating the pipeline as production-ready, verify:

  • The CLS topic is the intended source topic.
  • The target DLC database and table are correct.
  • Every required DLC field has a matching CLS log field.
  • Fields with different names are mapped manually.
  • Data types are compatible with the DLC table.
  • Partition mapping uses the CLS log time field when time partitioning is required.
  • Optional partition fields are stable and meaningful.
  • The task can be submitted successfully.
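
Several of these checks can be rehearsed against a written-down plan before the task is created. The sketch below assumes a hypothetical plan dictionary whose keys (`cls_topic`, `dlc_database`, `dlc_table`, `required_columns`, `field_mapping`, `time_partitioning`, `partition_source`, `time_field`) are invented for illustration and are not a CLS or DLC API. Type compatibility and the final submission step still have to be confirmed in the console.

```python
def validate_delivery_plan(plan):
    """Return a list of problems; an empty list means these checks passed."""
    problems = []
    if not plan.get("cls_topic"):
        problems.append("CLS source topic is missing")
    if not (plan.get("dlc_database") and plan.get("dlc_table")):
        problems.append("DLC database and table must both be set")

    mapping = plan.get("field_mapping", {})  # DLC column -> CLS field
    for column in plan.get("required_columns", []):
        if not mapping.get(column):
            problems.append(f"DLC column '{column}' has no CLS field")

    if plan.get("time_partitioning"):
        source = plan.get("partition_source")
        if not source or source != plan.get("time_field"):
            problems.append("time partitioning must use the CLS log time field")
    return problems


plan = {
    "cls_topic": "nginx-access",            # hypothetical topic name
    "dlc_database": "logs",                 # hypothetical database
    "dlc_table": "nginx_access",            # hypothetical table
    "required_columns": ["status", "uri"],
    "field_mapping": {"status": "status", "uri": "request_uri"},
    "time_partitioning": True,
    "partition_source": "log_time_ms",      # assumed name of the log time field
    "time_field": "log_time_ms",
}
print(validate_delivery_plan(plan) or "all checks passed")
```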

How This Fits a Log Data Architecture

CLS remains the operational log platform. Delivery extends the same data into downstream analytics systems:

Layer | Role
CLS log topic | Collects and stores operational log data.
Delivery task | Moves data from the topic into the selected downstream system.
DLC table | Stores log data for Spark-based processing and analysis.
Spark jobs | Run real-time or offline analytics, machine learning, or graph processing.

This architecture lets teams keep operational search in CLS while opening the same log stream to data-lake analysis.

FAQ

When should I use DLC instead of Ckafka or COS?

Use DLC when Spark-based analysis is the goal. Use Ckafka for streaming pipelines and COS for long-term archive or audit storage.

Do fields map automatically?

Only when CLS and DLC fields share the same name. When names differ, configure the source log field manually.

Is partition mapping required?

No. You can turn off partition mapping. For analytical workloads, time partitioning with the CLS log time field is often useful because most log analysis is time-window based.
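
As a rough illustration of why time partitions suit time-window analysis, the sketch below lists the daily partition values a query over a date range would need to read. It assumes daily partitions named by ISO date, which is an example layout rather than something the delivery task prescribes, and `daily_partitions` is a hypothetical helper.

```python
from datetime import date, timedelta


def daily_partitions(start, end):
    """List the daily partition values covering start..end, inclusive."""
    if end < start:
        raise ValueError("end must not be before start")
    days = (end - start).days
    return [(start + timedelta(days=i)).isoformat() for i in range(days + 1)]


print(daily_partitions(date(2024, 2, 28), date(2024, 3, 1)))
```

A three-day window touches three partitions regardless of how much history the table holds, which is the pruning benefit that time partitioning is meant to provide.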