A lightweight framework-agnostic library in pure PHP for part-of-speech tagging

Overview

N-ai php pos tagger

A lightweight, framework-agnostic library in pure PHP for part-of-speech tagging. It can be used for chatbots, personal assistants, keyword extraction, etc. Because it is written in PHP, it can be easily integrated into existing or new applications, giving them the ability to understand what users write.

It is based on vocabularies and predefined grammatical rules, with no wrappers around third-party systems and no neural networks, machine learning, or models that require huge resources.

This is the English version. Documentation and TODO are coming; more info and a demo are available at n-ai.cloud

Precision

This table reports tagging results on different types of sentence corpora.

Corpus | Total tokens | Correctly tagged | Not correctly tagged | % of total correct
"Just Shoot Me" movie subtitles | 3403 | 3381 | 22 | 99.35
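
The percentage column is the number of correctly tagged tokens divided by the total number of tokens. A minimal sketch of that arithmetic follows; the function name `taggingAccuracy` is hypothetical and is not part of the library.

```php
<?php

// Hypothetical helper (not part of NaiPosTagger):
// percentage of correctly tagged tokens, rounded to two decimals.
function taggingAccuracy(int $correct, int $total): float
{
    if ($total <= 0) {
        throw new InvalidArgumentException('Total token count must be positive.');
    }

    return round($correct / $total * 100, 2);
}

// Figures taken from the "Just Shoot Me" subtitles row.
echo taggingAccuracy(3381, 3403), PHP_EOL; // prints 99.35
```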

Installation

  1. In your project folder (e.g. "myproject"), install the package via Composer.

  2. Create a folder named "dictionaries".

  3. Inside the "dictionaries" folder, clone or download the English dictionary repository.

  4. Run this example script:

<?php

use NaiPosTagger\Pipelines\PipelinePosTagging;
use NaiPosTagger\Models\NaiPosArr;


include('vendor/autoload.php');

include(__DIR__ . '/vendor/nai-php/naipostagger/src/Utilities/common_functions_helper.php');

define('DICTIONARIES_PATH', __DIR__ . '/./dictionaries/dictionaries-');

define('TRAITS_PATH', __DIR__ . '/./vendor/nai-php/naipostagger/src/');

$sentence = 'my name is Fred';

$PipelinePosTagging = new PipelinePosTagging();

$PipelinePosTagging->language = 'en';

$pos_arr = $PipelinePosTagging->transform($sentence);

// hide metadata for a cleaner output
$pos_arr = NaiPosArr::clearMetadata($pos_arr);

// and further simplify the output
$pos_arr = NaiPosArr::flatPosArr($pos_arr);

diex($pos_arr);

And the output will be:

Array
(
    [0] => Array
        (
            [form] => .
            [lemma] => .
            [features] => SENT
            [sh-feat] => SENT
            [label] => 
            [rule] => 
            [pos_score] => 0
        )

    [1] => Array
        (
            [form] => my
            [lemma] => my
            [features] => ADJ:pos+m+s
            [sh-feat] => ADJ
            [label] => 
            [rule] => 
            [pos_score] => 0
        )

    [2] => Array
        (
            [form] => name
            [lemma] => name
            [features] => NOUN-m:s
            [sh-feat] => NOUN
            [label] => 
            [rule] => 
            [pos_score] => 0
        )

    [3] => Array
        (
            [form] => is
            [lemma] => is
            [features] => VER:ind+pres+3+s
            [sh-feat] => VER
            [label] => 
            [rule] => 
            [pos_score] => 0
        )

    [4] => Array
        (
            [form] => Fred
            [lemma] => Fred
            [features] => NPR
            [sh-feat] => NPR
            [label] => 
            [rule] => 
            [pos_score] => 0
        )

    [5] => Array
        (
            [form] => .
            [lemma] => .
            [features] => SENT
            [sh-feat] => SENT
            [label] => 
            [rule] => 
            [pos_score] => 0
        )

)
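
Since keyword extraction is one of the intended uses, the flattened array can be filtered by its short tag. The sketch below is only an illustration: it assumes the array shape shown in the output above, hard-codes a trimmed copy of that output as sample data, and uses a hypothetical helper `lemmasByTag` that is not part of the library.

```php
<?php

// Sample data mirroring the flattened output above (unused fields omitted).
$pos_arr = [
    ['form' => '.',    'lemma' => '.',    'features' => 'SENT',             'sh-feat' => 'SENT'],
    ['form' => 'my',   'lemma' => 'my',   'features' => 'ADJ:pos+m+s',      'sh-feat' => 'ADJ'],
    ['form' => 'name', 'lemma' => 'name', 'features' => 'NOUN-m:s',         'sh-feat' => 'NOUN'],
    ['form' => 'is',   'lemma' => 'is',   'features' => 'VER:ind+pres+3+s', 'sh-feat' => 'VER'],
    ['form' => 'Fred', 'lemma' => 'Fred', 'features' => 'NPR',              'sh-feat' => 'NPR'],
    ['form' => '.',    'lemma' => '.',    'features' => 'SENT',             'sh-feat' => 'SENT'],
];

// Hypothetical helper (not part of NaiPosTagger):
// return the lemmas of all tokens whose short tag is listed in $tags.
function lemmasByTag(array $posArr, array $tags): array
{
    $lemmas = [];
    foreach ($posArr as $token) {
        if (in_array($token['sh-feat'] ?? '', $tags, true)) {
            $lemmas[] = $token['lemma'];
        }
    }

    return $lemmas;
}

// Keep common nouns and proper nouns as candidate keywords.
print_r(lemmasByTag($pos_arr, ['NOUN', 'NPR']));
```

For the sample sentence this keeps "name" and "Fred"; the same helper could be called with other tags (for example `['VER']`) to select different word classes.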

To do list

  • Find contributors
  • Clean, check, fix and tag terms in dictionaries
  • Clean, check and fix Brill rules
  • Add more n-grams
  • Add more tests, especially for filters
  • Collect and load frill words
  • Better OOP for some classes?
  • In the module for logical analysis (not yet published), collect synonyms and temporal expressions

Releases(v0.2)
Owner
Giorgio Rey
A passionate PHP enthusiast