CSbyGB - Pentips
Buy me a tea
  • CS By GB - PenTips
    • Welcome to CSbyGB's Pentips
  • Networking, Protocols and Network pentest
    • Basics
    • DNS
    • FTP
    • HTTP & HTTPS
    • IMAP
    • IPMI
    • MSSQL
    • MYSQL
    • NFS
    • Oracle TNS
    • POP3
    • RDP
    • RPC
    • Rservices
    • Rsync
    • SMB
    • SMTP
    • SNMP
    • SSH
    • VOIP and related protocols
    • Winrm
    • WMI
    • Useful tips when you find unknown ports
  • Ethical Hacking - General Methodology
    • Introduction
    • Information Gathering
    • Scanning & Enumeration
    • Exploitation (basics)
    • Password Attacks
    • Post Exploitation
    • Lateral Movement
    • Proof-of-Concept
    • Post-Engagement
    • MITRE ATT&CK
  • External Pentest
    • External Pentest
  • Web Pentesting
    • Introduction to HTTP and web
    • Enumeration
    • OWASP Top 10
    • General Methodo & Misc Tips
    • Web Services and API
    • Vunerabilities and attacks
      • Clickjacking
      • CORS (Misconfigurations)
      • CSRF
      • SSRF
      • Bypass captcha
      • Template Injection (client and server side)
      • MFA bypass
      • XXE
    • Exposed git folder
    • Docker exploitation and Docker vulnerabilities
    • Websockets
  • Mobile App Pentest
    • Android
    • IOS
  • Wireless Pentest
    • Wireless pentest
  • Cloud Pentest
    • Cloud Pentest
    • Google Cloud Platform
    • AWS
  • Thick Client Pentest
    • Thick Client
  • Hardware Pentest
    • ATM
    • IoT
  • Secure Code Review
    • Secure code review
    • Java notes for Secure Code Review
  • AI & AI Pentest
    • MITRE ATLAS
    • OWASP ML and LLM
    • Hugging face
    • AI Python
    • Gemini
    • Ollama
  • Checklist
    • Web Application and API Pentest Checklist
    • Linux Privesc Checklist
    • Mobile App Pentest Checklist
  • Tools
    • Burpsuite
    • Android Studio
    • Frida
    • CrackMapExec
    • Netcat and alternatives
    • Nmap
    • Nuclei
    • Evil Winrm
    • Metasploit
    • Covenant
    • Mimikatz
    • Passwords, Hashes and wordlist tools
    • WFuzz
    • WPScan
    • Powershell Empire
    • Curl
    • Vulnerability Scanning tools
    • Payload Tools
    • Out of band Servers
    • STEWS
    • Webcrawlers
    • Websocat
  • VM and Labs
    • General tips
    • Setup your pentest lab
  • Linux
    • Initial Foothold
    • Useful commands and tools for pentest on Linux
    • Privilege Escalation
      • Kernel Exploits
      • Password and file permission
      • Sudo
      • SUID
      • Capabilities
      • Scheduled tasks
      • NFS Root Squashing
      • Services
      • PATH Abuse
      • Wildcard Abuse
      • Privileged groups
      • Exploit codes Cheat Sheet
  • Windows
    • Offensive windows
    • Enumeration and general Win tips
    • Privilege Escalation
    • Active Directory
    • Attacking Active Directory
      • LLMNR Poisoning
      • SMB Relay Attacks
      • Shell Access
      • IPv6 Attacks
      • Passback Attacks
      • Abusing ZeroLogon
    • Post-Compromise Enumeration
      • Powerview or SharpView (.NET equivalent)
      • AD Manual Enumeration
      • Bloodhound
      • Post Compromise Enumeration - Resources
    • Post Compromise Attacks
      • Pass the Password / Hash
      • Token Impersonation - Potato attacks
      • Kerberos
      • GPP/cPassword Attacks
      • URL File Attack
      • PrintNightmare
      • Printer Bug
      • AutoLogon exploitation
      • Always Installed Elevated exploitation
      • UAC Bypass
      • Abusing ACL
      • Unconstrained Delegation
    • Persistence
    • AV Evasion
    • Weaponization
    • Useful commands in Powershell, CMD and Sysinternals
    • Windows Internals
  • Programming
    • Python programming
    • My scripts
    • Kotlin
  • Binary Exploitation
    • Assembly
    • Buffer Overflow - Stack based - Winx86
    • Buffer Overflow - Stack based - Linux x86
  • OSINT
    • OSINT
    • Create an OSINT lab
    • Sock Puppets
    • Search engines
    • OSINT Images
    • OSINT Email
    • OSINT Password
    • OSINT Usernames
    • OSINT People
    • OSINT Social Media
    • OSINT Websites
    • OSINT Business
    • OSINT Wireless
    • OSINT Tools
    • Write an OSINT report
  • Pentester hardware toolbox
    • Flipper Zero
    • OMG cables
    • Rubber ducky
  • Post Exploitation
    • File transfers between target and attacking machine
    • Maintaining Access
    • Pivoting
    • Cleaning up
  • Reporting
    • How to report your findings
  • Red Team
    • Red Team
    • Defenses Enumeration
    • AV Evasion
  • Writeups
    • Hackthebox Tracks
      • Hackthebox - Introduction to Android Exploitation - Track
    • Hackthebox Writeups
      • Hackthebox - Academy
      • Hackthebox - Access
      • Hackthebox - Active
      • Hackthebox - Ambassador
      • Hackthebox - Arctic
      • Hackthebox - Awkward
      • Hackthebox - Backend
      • Hackthebox - BackendTwo
      • Hackthebox - Bastard
      • Hackthebox - Bastion
      • Hackthebox - Chatterbox
      • Hackthebox - Devel
      • Hackthebox - Driver
      • Hackthebox - Explore
      • Hackthebox - Forest
      • Hackthebox - Good games
      • Hackthebox - Grandpa
      • Hackthebox - Granny
      • Hackthebox - Inject
      • Hackthebox - Jeeves
      • Hackthebox - Jerry
      • Hackthebox - Lame
      • Hackthebox - Late
      • Hackthebox - Love
      • Hackthebox - Mentor
      • Hackthebox - MetaTwo
      • Hackthebox - Monteverde
      • Hackthebox - Nibbles
      • Hackthebox - Optimum
      • Hackthebox - Paper
      • Hackthebox - Photobomb
      • Hackthebox - Poison
      • Hackthebox - Precious
      • Hackthebox - Querier
      • Hackthebox - Resolute
      • Hackthebox - RouterSpace
      • Hackthebox - Sauna
      • Hackthebox - SecNotes
      • Hackthebox - Shoppy
      • Hackthebox - Soccer
      • Hackthebox - Steamcloud
      • Hackthebox - Toolbox
      • Hackthebox - Vault
      • Hackthebox - Updown
    • TryHackme Writeups
      • TryHackMe - Anonymous
      • TryHackMe - Blaster
      • TryHackMe - CMesS
      • TryHackMe - ConvertMyVideo
      • TryHackMe - Corridor
      • TryHackMe - LazyAdmin
      • TryHackMe - Looking Glass
      • TryHackMe - Nahamstore
      • TryHackMe - Overpass3
      • TryHackMe - OWASP Top 10 2021
      • TryHackMe - SimpleCTF
      • TryHackMe - SQL Injection Lab
      • TryHackMe - Sudo Security Bypass
      • TryHackMe - Tomghost
      • TryHackMe - Ultratech
      • TryHackMe - Vulnversity
      • TryHackMe - Wonderland
    • Vulnmachines Writeups
      • Web Labs Basic
      • Web Labs Intermediate
      • Cloud Labs
    • Mobile Hacking Lab
      • Mobile Hacking Lab - Lab - Config Editor
      • Mobile Hacking Lab - Lab - Strings
    • Portswigger Web Security Academy Writeups
      • PS - DomXSS
      • PS - Exploiting vulnerabilities in LLM APIs
    • OWASP projects and challenges writeups
      • OWASP MAS Crackmes
    • Vulnerable APIs
      • Vampi
      • Damn Vulnerable Web Service
      • Damn Vulnerable RESTaurant
    • Various Platforms
      • flAWS 1&2
  • Digital skills
    • How to make a gitbook
    • Marp
    • Linux Tips
    • Docker
    • VSCodium
    • Git Tips
    • Obsidian
  • Durable skills
    • Durable skills wheel/Roue des compétences durables
  • Projects
    • Projects
      • Technical Projects
      • General Projects
  • Talks
    • My Talks about Web Pentest
    • My talks about Android Application hacking
    • Other of my talks and Podcast
  • Resources
    • A list of random resources
Powered by GitBook
On this page
  • What is NLP?
  • What is hugging face?
  • Installation
  • Transformers
  • Modal
  • Pipeline
  • Resources
  1. AI & AI Pentest

Hugging face

PreviousOWASP ML and LLMNextAI Python

Last updated 6 months ago

What is NLP?

Natural Language Processing, or NLP, is a field of artificial intelligence (AI) that focuses on enabling computers to understand, interpret, and interact with human language. Essentially, it’s about teaching computers to "read," "write," and "understand" text or speech, just like humans do.

NLP powers many everyday applications, like:

  • Chatbots that understand and respond to questions

  • Voice assistants like Siri or Alexa that recognize spoken words and carry out tasks

  • Spam filters that identify unwanted emails based on their content

  • Translation apps that convert text from one language to another

  • Sentiment analysis that identifies emotions in text, like spotting a positive or negative review

For example ChatGPT is an NLP Model

What is hugging face?

Hugging Face is a company and an open-source community that specializes in developing and hosting machine learning models, particularly in natural language processing (NLP). They are known for their user-friendly tools and an ecosystem that makes machine learning accessible for developers, researchers, and businesses. Some key elements of Hugging Face include:

  1. Transformers Library: This open-source library offers pre-trained models for tasks like text classification, translation, summarization, and question-answering. It supports popular frameworks like PyTorch, TensorFlow, and JAX, and simplifies working with models like BERT, GPT, T5, and more.

  2. Hub: The Hugging Face Hub hosts thousands of models shared by the community and Hugging Face itself. It's a repository where anyone can find, upload, or deploy models and datasets for various machine-learning tasks. It also allows developers to build and fine-tune models easily.

  3. Datasets Library: A library of diverse, high-quality datasets for machine learning. The library includes datasets from various domains and languages, helping developers train models on specific tasks without manually sourcing data.

  4. Spaces: Hugging Face Spaces is a feature that enables users to create and share demos and applications of machine learning models with ease. It's popular for deploying models with Gradio or Streamlit, making it easy to showcase ML models interactively.

  5. Inference API: Hugging Face provides APIs for deploying models in production environments. Users can host models on Hugging Face’s servers, which handle the backend infrastructure, allowing for fast integration of models into applications.

  6. Community and Research: Hugging Face has become a central place for the ML community, supporting open science and democratizing AI by hosting workshops, challenges, and community events.

Installation

Transformers

Transformers are a type of model architecture that has revolutionized how machines process and understand text. Developed by researchers at Google in 2017, the Transformer model introduced a new way for models to handle language data, and it became the foundation for many advanced language models, including GPT, BERT, T5, and others.

  • How to install

mkdir transformers-env
cd transformers-env
python3 -m venv .
source ./bin/activate
pip install transformers

Modal

"modAL is an active learning framework for Python3, designed with modularity, flexibility and extensibility in mind. Built on top of scikit-learn, it allows you to rapidly create active learning workflows with nearly complete freedom. What is more, you can easily replace parts with your custom built solutions, allowing you to design novel algorithms with ease."

pip install modal
python -m modal setup

Pipeline

A pipeline is a tool that chains together a series of steps, simplifying complex workflows so that they run in a sequence. Think of it as an assembly line where data flows through multiple stages, each performing a specific task. In NLP, pipelines are commonly used to streamline tasks like text classification, translation, or summarization by automating the stages required to process and interpret the data.

Resources

Full documentation

Hugging face
here
Official documentation
Hugging Face 101: Your First Steps Coding with Transformers - Harper Carroll AI
Getting Started With Hugging Face in 15 Minutes | Transformers, Pipeline, Tokenizer, Models - Assembly AI