This is a PHP parser for PlantUML source files.

Overview

This package builds an AST of class definitions from PlantUML files. It works only with PHP.

Installation

Via Composer

composer require puml2php/puml-parser

Usage

Sample PlantUML source file (sample.puml):

@startuml
package Lexer {
    interface Tokenizeable
    package Lexer/Arrow {
        abstract class ArrowTokenizer implements Tokenizeable
        class LeftArrowTokenizer {
            + publicProperty : array
            # protectedProperty : string
            - privateProperty
        }
    }

    enum Enum {
      CASE1
      CASE2
      CASE3
    }

    LeftArrowTokenizer--|>ArrowTokenizer
    NoneDefinitionClass ..|> Tokenizeable
}
@enduml

The basic assumption is that each class definition is converted to a DTO before you work with it.

<?php

use PumlParser\Lexer\Lexer;
use PumlParser\Lexer\PumlTokenizer;
use PumlParser\Parser\Parser;

$lexer  = new Lexer(new PumlTokenizer());
$parser = new Parser($lexer);
$ast    = $parser->parse(__DIR__ . '/sample.puml');

foreach ($ast->toDtos() as $definition) {
    echo "----------\n";

    echo "name: " . $definition->getName() . "\n";
    echo "package: " . $definition->getPackage() . "\n";

    if ($definition->getType() === 'enum') {
        foreach ($definition->getCases() as $case) {
            echo "case: " . $case . "\n";
        }
    } else {
        foreach ($definition->getProperties() as $property) {
            echo "property name: " . $property->getName() . " , visibility:  " . $property->getVisibility() . "\n";
        }
    }
}

$ php sample.php
----------
name: Tokenizeable
package: Lexer
----------
name: ArrowTokenizer
package: Lexer\Arrow
----------
name: LeftArrowTokenizer
package: Lexer\Arrow
property name: publicProperty , visibility:  public
property name: protectedProperty , visibility:  protected
property name: privateProperty , visibility:  private
----------
name: Enum
package: Lexer
case: CASE1
case: CASE2
case: CASE3
----------
name: NoneDefinitionClass
package: Lexer
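
The loop above only prints each definition. As a minimal sketch of working with the DTOs further, the following groups definition names by package. It assumes PHP 8.0+ and relies only on the `getName()` and `getPackage()` accessors shown above. `StubDefinition` and `groupNamesByPackage()` are hypothetical names introduced here so the example runs without the package installed; with the real library you would pass `$ast->toDtos()` instead of the hand-built list.

```php
<?php

declare(strict_types=1);

// Hypothetical stand-in for the library's definition DTO. It exposes only the
// two accessors this sketch needs; it is NOT part of puml-parser.
final class StubDefinition
{
    public function __construct(
        private string $name,
        private string $package,
    ) {
    }

    public function getName(): string
    {
        return $this->name;
    }

    public function getPackage(): string
    {
        return $this->package;
    }
}

/**
 * Group definition names by their package, sorted by package name.
 *
 * @param iterable<object> $definitions objects with getName() and getPackage()
 * @return array<string, list<string>>
 */
function groupNamesByPackage(iterable $definitions): array
{
    $grouped = [];
    foreach ($definitions as $definition) {
        $grouped[$definition->getPackage()][] = $definition->getName();
    }
    ksort($grouped);

    return $grouped;
}

// Hand-built data mirroring the sample output above.
$definitions = [
    new StubDefinition('Tokenizeable', 'Lexer'),
    new StubDefinition('ArrowTokenizer', 'Lexer\\Arrow'),
    new StubDefinition('LeftArrowTokenizer', 'Lexer\\Arrow'),
    new StubDefinition('Enum', 'Lexer'),
    new StubDefinition('NoneDefinitionClass', 'Lexer'),
];

$grouped = groupNamesByPackage($definitions);

foreach ($grouped as $package => $names) {
    echo $package . ': ' . implode(', ', $names) . "\n";
}
```

Because the function only calls the two accessors, it should accept any iterable of objects that provide them.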

Three output formats are supported: JSON, array, and DTO.

<?php

use PumlParser\Lexer\Lexer;
use PumlParser\Lexer\PumlTokenizer;
use PumlParser\Parser\Parser;

$lexer  = new Lexer(new PumlTokenizer());
$parser = new Parser($lexer);
$ast    = $parser->parse(__DIR__ . '/sample.puml');

Dump of $ast->toDtos():

array(4) {
  [0]=>
  object(PumlParser\Dto\Definition)#59 (6) {
    ["name":"PumlParser\Dto\Definition":private]=>
    string(12) "Tokenizeable"
    ["type":"PumlParser\Dto\Definition":private]=>
    string(9) "interface"
    ["package":"PumlParser\Dto\Definition":private]=>
    string(5) "Lexer"
    ["properties":"PumlParser\Dto\Definition":private]=>
    array(0) {
    }
    ["parents":"PumlParser\Dto\Definition":private]=>
    array(0) {
    }
    ["interfaces":"PumlParser\Dto\Definition":private]=>
    array(0) {
    }
  }
  [1]=>
  object(PumlParser\Dto\Definition)#62 (6) {
    ["name":"PumlParser\Dto\Definition":private]=>
    string(14) "ArrowTokenizer"
    ["type":"PumlParser\Dto\Definition":private]=>
    string(14) "abstract class"
    ["package":"PumlParser\Dto\Definition":private]=>
    string(11) "Lexer\Arrow"
    ["properties":"PumlParser\Dto\Definition":private]=>
    array(0) {
    }
    ["parents":"PumlParser\Dto\Definition":private]=>
    array(0) {
    }
    ["interfaces":"PumlParser\Dto\Definition":private]=>
    array(1) {
      [0]=>
      object(PumlParser\Dto\Definition)#46 (6) {
        ["name":"PumlParser\Dto\Definition":private]=>
        string(12) "Tokenizeable"
        ["type":"PumlParser\Dto\Definition":private]=>
        string(9) "interface"
        ["package":"PumlParser\Dto\Definition":private]=>
        string(5) "Lexer"
        ["properties":"PumlParser\Dto\Definition":private]=>
        array(0) {
        }
        ["parents":"PumlParser\Dto\Definition":private]=>
        array(0) {
        }
        ["interfaces":"PumlParser\Dto\Definition":private]=>
        array(0) {
        }
      }
    }
  }
  [2]=>
  object(PumlParser\Dto\Definition)#61 (6) {
    ["name":"PumlParser\Dto\Definition":private]=>
    string(18) "LeftArrowTokenizer"
    ["type":"PumlParser\Dto\Definition":private]=>
    string(5) "class"
    ["package":"PumlParser\Dto\Definition":private]=>
    string(11) "Lexer\Arrow"
    ["properties":"PumlParser\Dto\Definition":private]=>
    array(3) {
      [0]=>
      object(PumlParser\Dto\PropertyDefinition)#34 (2) {
        ["name":"PumlParser\Dto\PropertyDefinition":private]=>
        string(14) "publicProperty"
        ["visibility":"PumlParser\Dto\PropertyDefinition":private]=>
        string(6) "public"
      }
      [1]=>
      object(PumlParser\Dto\PropertyDefinition)#33 (2) {
        ["name":"PumlParser\Dto\PropertyDefinition":private]=>
        string(17) "protectedProperty"
        ["visibility":"PumlParser\Dto\PropertyDefinition":private]=>
        string(9) "protected"
      }
      [2]=>
      object(PumlParser\Dto\PropertyDefinition)#60 (2) {
        ["name":"PumlParser\Dto\PropertyDefinition":private]=>
        string(15) "privateProperty"
        ["visibility":"PumlParser\Dto\PropertyDefinition":private]=>
        string(7) "private"
      }
    }
    ["parents":"PumlParser\Dto\Definition":private]=>
    array(1) {
      [0]=>
      object(PumlParser\Dto\Definition)#26 (6) {
        ["name":"PumlParser\Dto\Definition":private]=>
        string(14) "ArrowTokenizer"
        ["type":"PumlParser\Dto\Definition":private]=>
        string(14) "abstract class"
        ["package":"PumlParser\Dto\Definition":private]=>
        string(11) "Lexer\Arrow"
        ["properties":"PumlParser\Dto\Definition":private]=>
        array(0) {
        }
        ["parents":"PumlParser\Dto\Definition":private]=>
        array(0) {
        }
        ["interfaces":"PumlParser\Dto\Definition":private]=>
        array(1) {
          [0]=>
          object(PumlParser\Dto\Definition)#57 (6) {
            ["name":"PumlParser\Dto\Definition":private]=>
            string(12) "Tokenizeable"
            ["type":"PumlParser\Dto\Definition":private]=>
            string(9) "interface"
            ["package":"PumlParser\Dto\Definition":private]=>
            string(5) "Lexer"
            ["properties":"PumlParser\Dto\Definition":private]=>
            array(0) {
            }
            ["parents":"PumlParser\Dto\Definition":private]=>
            array(0) {
            }
            ["interfaces":"PumlParser\Dto\Definition":private]=>
            array(0) {
            }
          }
        }
      }
    }
    ["interfaces":"PumlParser\Dto\Definition":private]=>
    array(0) {
    }
  }
  [3]=>
  object(PumlParser\Dto\Definition)#41 (6) {
    ["name":"PumlParser\Dto\Definition":private]=>
    string(19) "NoneDefinitionClass"
    ["type":"PumlParser\Dto\Definition":private]=>
    string(5) "class"
    ["package":"PumlParser\Dto\Definition":private]=>
    string(5) "Lexer"
    ["properties":"PumlParser\Dto\Definition":private]=>
    array(0) {
    }
    ["parents":"PumlParser\Dto\Definition":private]=>
    array(0) {
    }
    ["interfaces":"PumlParser\Dto\Definition":private]=>
    array(1) {
      [0]=>
      object(PumlParser\Dto\Definition)#56 (6) {
        ["name":"PumlParser\Dto\Definition":private]=>
        string(12) "Tokenizeable"
        ["type":"PumlParser\Dto\Definition":private]=>
        string(9) "interface"
        ["package":"PumlParser\Dto\Definition":private]=>
        string(5) "Lexer"
        ["properties":"PumlParser\Dto\Definition":private]=>
        array(0) {
        }
        ["parents":"PumlParser\Dto\Definition":private]=>
        array(0) {
        }
        ["interfaces":"PumlParser\Dto\Definition":private]=>
        array(0) {
        }
      }
    }
  }
}

Dump of $ast->toJson():

[
    {
        "interface": {
            "Name": "Tokenizeable",
            "Package": "Lexer",
            "Propaties": [],
            "Parents": [],
            "Interfaces": []
        }
    },
    {
        "abstract class": {
            "Name": "ArrowTokenizer",
            "Package": "Lexer/Arrow",
            "Propaties": [],
            "Parents": [],
            "Interfaces": [
                {
                    "interface": {
                        "Name": "Tokenizeable",
                        "Package": "Lexer",
                        "Propaties": [],
                        "Parents": [],
                        "Interfaces": []
                    }
                }
            ]
        }
    },
    {
        "class": {
            "Name": "LeftArrowTokenizer",
            "Package": "Lexer/Arrow",
            "Propaties": [
                {
                    "name": "publicProperty",
                    "visibility": "public"
                },
                {
                    "name": "protectedProperty",
                    "visibility": "protected"
                },
                {
                    "name": "privateProperty",
                    "visibility": "private"
                }
            ],
            "Parents": [
                {
                    "abstract class": {
                        "Name": "ArrowTokenizer",
                        "Package": "Lexer/Arrow",
                        "Propaties": [],
                        "Parents": [],
                        "Interfaces": [
                            {
                                "interface": {
                                    "Name": "Tokenizeable",
                                    "Package": "Lexer",
                                    "Propaties": [],
                                    "Parents": [],
                                    "Interfaces": []
                                }
                            }
                        ]
                    }
                }
            ],
            "Interfaces": []
        }
    },
    {
        "class": {
            "Name": "NoneDefinitionClass",
            "Package": "Lexer",
            "Propaties": [],
            "Parents": [],
            "Interfaces": [
                {
                    "interface": {
                        "Name": "Tokenizeable",
                        "Package": "Lexer",
                        "Propaties": [],
                        "Parents": [],
                        "Interfaces": []
                    }
                }
            ]
        }
    }
]
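
The JSON form is convenient when the result is consumed outside the parser. The sketch below decodes a trimmed-down sample in the same shape as the output above and summarises each entry. `summarise()` is a hypothetical helper, not part of the package, and the inline JSON is hand-written sample data rather than real parser output.

```php
<?php

declare(strict_types=1);

// Hand-written sample in the same shape as the toJson() output above.
$json = <<<'JSON'
[
    {
        "interface": {
            "Name": "Tokenizeable",
            "Package": "Lexer",
            "Propaties": [],
            "Parents": [],
            "Interfaces": []
        }
    },
    {
        "class": {
            "Name": "LeftArrowTokenizer",
            "Package": "Lexer/Arrow",
            "Propaties": [
                {"name": "publicProperty", "visibility": "public"},
                {"name": "privateProperty", "visibility": "private"}
            ],
            "Parents": [],
            "Interfaces": []
        }
    }
]
JSON;

/**
 * Summarise each top-level entry as "<type> <package>/<name> (<n> properties)".
 *
 * @return list<string>
 */
function summarise(string $json): array
{
    $entries = json_decode($json, true, 512, JSON_THROW_ON_ERROR);

    $lines = [];
    foreach ($entries as $entry) {
        // Each entry is a single-key object; the key is the definition type.
        foreach ($entry as $type => $body) {
            $lines[] = sprintf(
                '%s %s/%s (%d properties)',
                $type,
                $body['Package'],
                $body['Name'],
                count($body['Propaties']) // the key is spelled this way in the output
            );
        }
    }

    return $lines;
}

$summary = summarise($json);
echo implode("\n", $summary) . "\n";
```

Note that the property list is read under the key `Propaties`, exactly as it is spelled in the output above.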

Dump of $ast->toArray():

array(4) {
  [0]=>
  array(1) {
    ["interface"]=>
    array(5) {
      ["Name"]=>
      string(12) "Tokenizeable"
      ["Package"]=>
      string(5) "Lexer"
      ["Propaties"]=>
      array(0) {
      }
      ["Parents"]=>
      array(0) {
      }
      ["Interfaces"]=>
      array(0) {
      }
    }
  }
  [1]=>
  array(1) {
    ["abstract class"]=>
    array(5) {
      ["Name"]=>
      string(14) "ArrowTokenizer"
      ["Package"]=>
      string(11) "Lexer/Arrow"
      ["Propaties"]=>
      array(0) {
      }
      ["Parents"]=>
      array(0) {
      }
      ["Interfaces"]=>
      array(1) {
        [0]=>
        array(1) {
          ["interface"]=>
          array(5) {
            ["Name"]=>
            string(12) "Tokenizeable"
            ["Package"]=>
            string(5) "Lexer"
            ["Propaties"]=>
            array(0) {
            }
            ["Parents"]=>
            array(0) {
            }
            ["Interfaces"]=>
            array(0) {
            }
          }
        }
      }
    }
  }
  [2]=>
  array(1) {
    ["class"]=>
    array(5) {
      ["Name"]=>
      string(18) "LeftArrowTokenizer"
      ["Package"]=>
      string(11) "Lexer/Arrow"
      ["Propaties"]=>
      array(3) {
        [0]=>
        array(2) {
          ["name"]=>
          string(14) "publicProperty"
          ["visibility"]=>
          string(6) "public"
        }
        [1]=>
        array(2) {
          ["name"]=>
          string(17) "protectedProperty"
          ["visibility"]=>
          string(9) "protected"
        }
        [2]=>
        array(2) {
          ["name"]=>
          string(15) "privateProperty"
          ["visibility"]=>
          string(7) "private"
        }
      }
      ["Parents"]=>
      array(1) {
        [0]=>
        array(1) {
          ["abstract class"]=>
          array(5) {
            ["Name"]=>
            string(14) "ArrowTokenizer"
            ["Package"]=>
            string(11) "Lexer/Arrow"
            ["Propaties"]=>
            array(0) {
            }
            ["Parents"]=>
            array(0) {
            }
            ["Interfaces"]=>
            array(1) {
              [0]=>
              array(1) {
                ["interface"]=>
                array(5) {
                  ["Name"]=>
                  string(12) "Tokenizeable"
                  ["Package"]=>
                  string(5) "Lexer"
                  ["Propaties"]=>
                  array(0) {
                  }
                  ["Parents"]=>
                  array(0) {
                  }
                  ["Interfaces"]=>
                  array(0) {
                  }
                }
              }
            }
          }
        }
      }
      ["Interfaces"]=>
      array(0) {
      }
    }
  }
  [3]=>
  array(1) {
    ["class"]=>
    array(5) {
      ["Name"]=>
      string(19) "NoneDefinitionClass"
      ["Package"]=>
      string(5) "Lexer"
      ["Propaties"]=>
      array(0) {
      }
      ["Parents"]=>
      array(0) {
      }
      ["Interfaces"]=>
      array(1) {
        [0]=>
        array(1) {
          ["interface"]=>
          array(5) {
            ["Name"]=>
            string(12) "Tokenizeable"
            ["Package"]=>
            string(5) "Lexer"
            ["Propaties"]=>
            array(0) {
            }
            ["Parents"]=>
            array(0) {
            }
            ["Interfaces"]=>
            array(0) {
            }
          }
        }
      }
    }
  }
}
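
In the array form, parents and interfaces are nested as full definitions, so relationships can be read by walking the structure recursively. The following is a minimal sketch under that assumption. `collectEdges()` is a hypothetical helper, and the input is hand-built to match the dump above rather than produced by the parser.

```php
<?php

declare(strict_types=1);

/**
 * Collect "Child extends Parent" / "Child implements Interface" edges from
 * data shaped like the toArray() dump, following nested entries recursively.
 *
 * @param array<int, array<string, array<string, mixed>>> $definitions
 * @return list<string> unique edges in first-seen order
 */
function collectEdges(array $definitions): array
{
    $edges = [];
    foreach ($definitions as $definition) {
        // Each definition is a one-element array keyed by its type.
        foreach ($definition as $body) {
            foreach (['Parents' => 'extends', 'Interfaces' => 'implements'] as $key => $verb) {
                foreach ($body[$key] as $related) {
                    $relatedBody = reset($related);
                    $edges[] = sprintf('%s %s %s', $body['Name'], $verb, $relatedBody['Name']);
                }
                // Nested entries carry their own Parents/Interfaces lists.
                $edges = array_merge($edges, collectEdges($body[$key]));
            }
        }
    }

    return array_values(array_unique($edges));
}

// Hand-built sample mirroring part of the dump above.
$tokenizeable = ['interface' => [
    'Name' => 'Tokenizeable', 'Package' => 'Lexer',
    'Propaties' => [], 'Parents' => [], 'Interfaces' => [],
]];
$arrowTokenizer = ['abstract class' => [
    'Name' => 'ArrowTokenizer', 'Package' => 'Lexer/Arrow',
    'Propaties' => [], 'Parents' => [], 'Interfaces' => [$tokenizeable],
]];
$leftArrowTokenizer = ['class' => [
    'Name' => 'LeftArrowTokenizer', 'Package' => 'Lexer/Arrow',
    'Propaties' => [], 'Parents' => [$arrowTokenizer], 'Interfaces' => [],
]];

$edges = collectEdges([$tokenizeable, $arrowTokenizer, $leftArrowTokenizer]);

foreach ($edges as $edge) {
    echo $edge . "\n";
}
```

Duplicates are removed at the end because the same relationship appears both at the top level and inside nested parent entries.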

License

The MIT License (MIT). Please see LICENSE for more information.


Comments
  • Capture complete property line

    I played around with the latest update and found an issue. This changes behaviour so it might be a breaking change. Maybe this is ok as a bug fix. You should determine whether this is a bugfix or not, but it helps me to work further:

    Again my testing payload is

    @startuml
    
    hide empty methods
    
    package Heptacom\HeptaConnect\Playground\Dataset {
        class Cap {
            + type : CapType
        }
    }
    @enduml
    

    Before my change the property is interpreted as:

    (image)

    With my change it looks like this:

    (image)

    It allows property lines to be captured completely when whitespace separates multiple entries. Without this change the property is not completely covered. I assume it may produce some unwanted results, because I am merging multiple values with a single space, which is not the exact whitespace that was originally skipped by the tokenizer. I also assume that when you follow the PlantUML guides on changing the class body, the group separators could now become part of a property (see their example).

    Please give feedback on how to proceed with my finding.

    opened by JoshuaBehrens 4
  • Missing properties/fields

    I would really like to write a code generator with this library, but your parser does not read the fields yet :/ With your package it would be possible to work with PlantUML files using PHP only. I am watching this repo, and maybe I can switch from XMI to this package :)

    opened by JoshuaBehrens 3
  • Parsing of small plantuml fails on (presumably) missing visibility

    I was looking into the parsing abilities of your changes in 2.1. It looks quite promising to me. It failed though to parse this file:

    @startuml
    
    hide empty methods
    
    package Heptacom\HeptaConnect\Playground\Dataset {
        class Cap {
            + type : CapType
        }
    }
    @enduml
    

    It fails with this stacktrace:

    InvalidArgumentException: 
    
    vendor/puml2php/puml-parser/src/Lexer/Token/Tokens.php:36
    vendor/puml2php/puml-parser/src/Lexer/Token/Tokens.php:50
    vendor/puml2php/puml-parser/src/Parser/Parser.php:149
    vendor/puml2php/puml-parser/src/Parser/Parser.php:73
    vendor/puml2php/puml-parser/src/Parser/Parser.php:114
    vendor/puml2php/puml-parser/src/Parser/Parser.php:68
    vendor/puml2php/puml-parser/src/Parser/Parser.php:50
    

    I can follow the stack trace, but the exception does not help me understand where the tokenizing begins to fail. So it would be best to address these two points:

    1. Have an understandable error message. There are already exceptions that can tell exact positions in the UML code (I saw some related to #10). Maybe we can get this in there as well.
    2. Understand why the above PlantUML code fails and fix either the incoming UML code or the tokenizer.
    opened by JoshuaBehrens 2
Releases(v3.2.1)
Owner
Tasuku Yamashita