A lightweight framework-agnostic library in pure PHP for part-of-speech tagging

Giorgio Rey

Last update: Nov 8, 2022

Overview

N-ai php pos tagger

A lightweight, framework-agnostic library in pure PHP for part-of-speech tagging. It can be used for chatbots, personal assistants, keyword extraction, and similar tasks. Being written in PHP, it integrates easily into existing or new applications and gives them a way to analyze what users write.

It is based on vocabularies and predefined grammatical rules. It does not wrap third-party systems and does not rely on neural networks, machine learning, or models that require huge resources.

This is the English version. Documentation and a TODO list are coming; more info and a demo are available at n-ai.cloud

Precision

This table collects results for different types of sentence corpora.

Corpus	Total tokens	Correctly tagged	Incorrectly tagged	Correct (%)
"Just Shoot Me" movie subtitles	3403	3381	22	99.35
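
As a quick check of the figures in the table, the percentage can be recomputed from the token counts. This is only a sketch of the arithmetic; the variable names are illustrative and not part of the library.

```php
<?php
// Token counts taken from the "Just Shoot Me" row of the table above.
$totalTokens = 3403;
$correct     = 3381;

// Tokens that were not tagged correctly.
$incorrect = $totalTokens - $correct;

// Share of correctly tagged tokens, as a percentage rounded to two decimals.
$accuracy = round($correct / $totalTokens * 100, 2);

printf("%d incorrect, %.2f%% correct\n", $incorrect, $accuracy);
```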

Installation

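In your project folder (e.g. "myproject"), install the package via Composer (the vendor path used in the script below suggests the package name nai-php/naipostagger);
create a folder named "dictionaries";
inside the "dictionaries" folder, clone or download the English dictionary repository;
run this example script: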

use NaiPosTagger\Pipelines\PipelinePosTagging;
use NaiPosTagger\Models\NaiPosArr;


include('vendor/autoload.php');

include(__DIR__ . '/vendor/nai-php/naipostagger/src/Utilities/common_functions_helper.php');

define('DICTIONARIES_PATH', __DIR__ . '/./dictionaries/dictionaries-');

define('TRAITS_PATH', __DIR__ . '/./vendor/nai-php/naipostagger/src/');

$sentence = 'my name is Fred';

$PipelinePosTagging = new PipelinePosTagging();

$PipelinePosTagging->language = 'en';

$pos_arr = $PipelinePosTagging->transform($sentence);

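// for cleaner output, strip the metadata
$pos_arr = NaiPosArr::clearMetadata($pos_arr);

// then flatten the result to simplify the output further
$pos_arr = NaiPosArr::flatPosArr($pos_arr);

// diex() is not a PHP built-in; it presumably comes from the helper file included above
diex($pos_arr);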

And the output will be:

Array
(
    [0] => Array
        (
            [form] => .
            [lemma] => .
            [features] => SENT
            [sh-feat] => SENT
            [label] => 
            [rule] => 
            [pos_score] => 0
        )

    [1] => Array
        (
            [form] => my
            [lemma] => my
            [features] => ADJ:pos+m+s
            [sh-feat] => ADJ
            [label] => 
            [rule] => 
            [pos_score] => 0
        )

    [2] => Array
        (
            [form] => name
            [lemma] => name
            [features] => NOUN-m:s
            [sh-feat] => NOUN
            [label] => 
            [rule] => 
            [pos_score] => 0
        )

    [3] => Array
        (
            [form] => is
            [lemma] => is
            [features] => VER:ind+pres+3+s
            [sh-feat] => VER
            [label] => 
            [rule] => 
            [pos_score] => 0
        )

    [4] => Array
        (
            [form] => Fred
            [lemma] => Fred
            [features] => NPR
            [sh-feat] => NPR
            [label] => 
            [rule] => 
            [pos_score] => 0
        )

    [5] => Array
        (
            [form] => .
            [lemma] => .
            [features] => SENT
            [sh-feat] => SENT
            [label] => 
            [rule] => 
            [pos_score] => 0
        )

)
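
Since keyword extraction is one of the stated use cases, here is a minimal sketch of how the flattened array above could be filtered by its short tag ("sh-feat"). It runs on a hard-coded copy of the output rather than on the library itself, and extractKeywords() is a hypothetical helper, not part of the library's API. Treating NOUN and NPR as "keywords" is also just an assumption for illustration.

```php
<?php
// Hard-coded copy of the flattened output shown above (only the keys used here).
$pos_arr = [
    ['form' => '.',    'lemma' => '.',    'sh-feat' => 'SENT'],
    ['form' => 'my',   'lemma' => 'my',   'sh-feat' => 'ADJ'],
    ['form' => 'name', 'lemma' => 'name', 'sh-feat' => 'NOUN'],
    ['form' => 'is',   'lemma' => 'is',   'sh-feat' => 'VER'],
    ['form' => 'Fred', 'lemma' => 'Fred', 'sh-feat' => 'NPR'],
    ['form' => '.',    'lemma' => '.',    'sh-feat' => 'SENT'],
];

/**
 * Hypothetical helper (not part of the library): return the lemmas of all
 * tokens whose short tag is one of $wanted.
 */
function extractKeywords(array $posArr, array $wanted = ['NOUN', 'NPR']): array
{
    $keywords = [];
    foreach ($posArr as $token) {
        // Tokens without a "sh-feat" key are skipped.
        if (in_array($token['sh-feat'] ?? '', $wanted, true)) {
            $keywords[] = $token['lemma'];
        }
    }
    return $keywords;
}

$keywords = extractKeywords($pos_arr);
print_r($keywords);
```

For the sample sentence this keeps "name" and "Fred". In a real application, $pos_arr would come from the transform(), clearMetadata() and flatPosArr() calls in the script above.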

To do list

Find contributors
Clean, check, fix and tag terms in the dictionaries
Clean, check and fix the Brill rules
Add more n-grams
Add more tests, especially for filters
Collect and load frill words
Better OOP for some classes?
In the module for logical analysis (not yet published), collect synonyms and temporal expressions

Comments

Including the common functions file directly in Composer
There is no reason to do

include(__DIR__ . '/vendor/nai-php/naipostagger/src/Utilities/common_functions_helper.php');

Files can also be added to Composer's autoloader directly
opened by lsv
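
A sketch of what this suggestion could look like in the library's composer.json, using Composer's "files" autoload type. The library's actual composer.json is not shown here, so this fragment is an assumption: it would have to be merged with whatever "autoload" section already exists, and the path is taken from the include statement above, relative to the package root.

```json
{
    "autoload": {
        "files": [
            "src/Utilities/common_functions_helper.php"
        ]
    }
}
```

With an entry like this, Composer loads the helper file whenever vendor/autoload.php is included, so the separate include line in the example script would no longer be needed.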

Releases (v0.2)

v0.2 (Aug 22, 2021)
