Big Data Engineer - TikTok Short-Text Recommendation Architecture

About the team
Our Recommendation Architecture Team is responsible for building and optimizing the architecture of our recommendation system to provide the most stable and best experience for TikTok users. We cover almost all short-text recommendation scenarios in TikTok, such as search suggestions, the video-related search bar, and comment entities. Our recommendation system supports personalized sorting for queries, optimizing the user experience and improving TikTok's search awareness.

Responsibilities
- Design and implement a reasonable offline data architecture for large-scale recommendation systems
- Design and implement flexible, scalable, stable, and high-performance storage and computing systems
- Troubleshoot the production system, and design and implement the mechanisms and tools needed to ensure its overall operational stability
- Build industry-leading distributed storage and computing systems to provide reliable infrastructure for massive data and large-scale business systems
- Develop and implement techniques and analytics applications to transform raw data into meaningful information using data-oriented programming languages and visualisation software
- Apply data mining, data modelling, natural language processing, and machine learning to extract and analyse information from large structured and unstructured datasets
- Visualise, interpret, and report data findings, and create dynamic data reports where needed

Minimum Qualifications
- Bachelor's degree or above in Computer Science or a related field, with 3+ years of experience
- Familiar with many open source frameworks in the field of big data, such as Hadoop, Hive, Flink, FlinkSQL, Spark, Kafka, HBase, Redis, RocksDB, Elasticsearch, etc.
- Experience in programming, including but not limited to the following languages: C, C++, Java, or Golang
- Effective communication skills and a sense of ownership and drive
- Experience with petabyte-level data processing is a plus