
Closed

Develop a Web Scraper

Case No. TK25090510BVMI11 ・ Updated 2025/10/06

  • Budget

    To be discussed

  • Work location

    Remote OK

  • Contractor type

    Any

  • Requirements

    We need a freelance developer in Taiwan to build a Web Scraper that integrates with our API and automatically fills in missing product images. The key requirements are:

    1. API integration
       • Read our product manifest (CSV/DB) first
       • If the API already has a complete set of images → no action required
       • If the API is missing images, or the whole product → start the scraper

    2. Image capture rules
       • Capture only plain product images (no models or real people wearing the item)
       • Automatically filter out lookbook, lifestyle, UGC, and other photos that contain people
       • Capture the highest-resolution / original-size image (avoid thumbnails)

    3. Data quality
       • Use file hashes (hash / perceptual hash) to avoid duplicates
       • Each product needs at least 4 images with a resolution of ≥1200px
       • Organize the images into a clear folder structure (one folder per product)

    4. Special cases to handle
       • Images loaded dynamically by JavaScript (Playwright)
       • Variant images for different colors/sizes of a product
       • Galleries with infinite scrolling or lazy loading
       • CDN parameters (such as ?width=600) → the original image must be retrieved

    5. Data update
       • Update the manifest, including:
         • Status (ok, duplicate, rejected_model, failed)
         • File path, checksum, source (api / scraper / mixed)
       • Generate error logs to facilitate debugging

    ⸻

    Qualifications
    • Familiar with Python web scraping
    • Familiar with Playwright or Scrapy + Playwright
    • Experience in image classification/detection (OpenCV, PyTorch, or a simple ML model)
    • Able to handle duplicate detection (hash + pHash)
    • Experience with e-commerce platforms (Shopify / WooCommerce / Wix) is preferred
    • Experience with anti-bot handling (rate limits, retry mechanisms, headers) is a plus

    ⸻

    Expected deliverables
    1. Python scraper (runnable from the command line)
    2. Folder structure organized by product
    3. Updated manifest (status + file information)
    4. Test sample (20–50 products)
    5. README / usage documentation
    6. (Optional) Dockerfile

    ⸻

    Success criteria
    • All missing images are filled in
    • No duplicate files
    • No model / real-person images
    • The manifest is updated correctly and the scraper can be rerun without errors

    ⸻

    💰 Budget: NT$4,285
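
    Two of the requirements above, retrieving the original image behind CDN parameters such as ?width=600 and rejecting exact duplicates by file hash, can be sketched with the Python standard library alone. This is only an illustration under stated assumptions: the names RESIZE_PARAMS, strip_resize_params, and classify are hypothetical, the list of resize parameters is a guess that will differ per CDN, and the example URL is made up. Perceptual hashing (pHash) needs a third-party library and is not covered here.

```python
import hashlib
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumption: these query keys only control resizing. Real CDNs differ,
# so this set would need to be tuned per site.
RESIZE_PARAMS = {"width", "height", "w", "h", "quality", "q"}


def strip_resize_params(url: str) -> str:
    """Drop resize-style query parameters and keep everything else."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in RESIZE_PARAMS
    ]
    return urlunsplit(parts._replace(query=urlencode(kept)))


def classify(data: bytes, seen: set) -> str:
    """Return 'duplicate' if identical bytes were seen before, else 'ok'.

    `seen` holds SHA-256 hex digests of images already accepted.
    The status strings match the manifest statuses in the posting.
    """
    digest = hashlib.sha256(data).hexdigest()
    if digest in seen:
        return "duplicate"
    seen.add(digest)
    return "ok"


if __name__ == "__main__":
    # Hypothetical URL, used only to show the rewrite.
    print(strip_resize_params("https://cdn.example.com/p/shoe.jpg?width=600&v=3"))

    seen_digests = set()
    print(classify(b"image-bytes", seen_digests))
    print(classify(b"image-bytes", seen_digests))
```

    An exact hash only catches byte-identical files. The same photo re-encoded or resized would pass this check, which is why the posting asks for a perceptual hash as well.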
