A fast PHP slug generator and transliteration library that converts non-ascii characters for use in URLs.

Overview

URLify for PHP

A fast PHP slug generator and transliteration library, started as a PHP port of URLify.js from the Django project.

Handles symbols from Latin languages, Arabic, Azerbaijani, Bulgarian, Burmese, Croatian, Czech, Danish, Esperanto, Estonian, Finnish, French, Switzerland (French), Austrian (French), Georgian, German, Switzerland (German), Austrian (German), Greek, Hindi, Kazakh, Latvian, Lithuanian, Norwegian, Persian, Polish, Romanian, Russian, Swedish, Serbian, Slovak, Turkish, Ukrainian and Vietnamese, and many others via ASCII::to_transliterate().

Symbols it cannot transliterate are either omitted or replaced with a specified character.

Installation

Install the latest version with:

$ composer require jbroadway/urlify

Usage

First, include Composer's autoloader:

require_once 'vendor/autoload.php';

To generate slugs for URLs:

<?php

echo URLify::slug (' J\'étudie le français ');
// "jetudie-le-francais"

echo URLify::slug ('Lo siento, no hablo español.');
// "lo-siento-no-hablo-espanol"

To generate slugs for file names:

<?php

echo URLify::filter ('фото.jpg', 60, "", true);
// "foto.jpg"

To simply transliterate characters:

<?php

echo URLify::downcode ('J\'étudie le français');
// "J'etudie le francais"

echo URLify::downcode ('Lo siento, no hablo español.');
// "Lo siento, no hablo espanol."

/* Or use transliterate() alias: */

echo URLify::transliterate ('Lo siento, no hablo español.');
// "Lo siento, no hablo espanol."

To extend the character list:

<?php

URLify::add_chars ([
	'¿' => '?', '®' => '(r)', '¼' => '1/4',
	'½' => '1/2', '¾' => '3/4', '¶' => 'P'
]);

echo URLify::downcode ('¿ ® ¼ ¼ ¾ ¶');
// "? (r) 1/4 1/4 3/4 P"

To extend the list of words to remove:

<?php

URLify::remove_words (['remove', 'these', 'too']);

To prioritize a certain language map:

<?php

echo URLify::filter ('Ägypten und Österreich besitzen wie üblich ein Übermaß an ähnlich öligen Attachés', 60, 'de');
// "aegypten-und-oesterreich-besitzen-wie-ueblich-ein-uebermass-aehnlich-oeligen-attaches"

echo URLify::filter ('Cağaloğlu, çalıştığı, müjde, lazım, mahkûm', 60, 'tr');
// "cagaloglu-calistigi-mujde-lazim-mahkum"

Please note that the "ü" is transliterated to "ue" in the first case, whereas it results in a simple "u" in the latter.
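One way to picture language prioritization is as a lookup order: the preferred language's map is consulted first, and the remaining maps only fill in characters it does not define. The sketch below is not URLify's implementation; the `$maps` array and the `sketch_downcode()` function are hypothetical names used only for illustration, and the maps are deliberately tiny.

```php
<?php

// Hypothetical per-language maps (assumption: real maps are much larger).
$maps = [
    'de' => ['ü' => 'ue', 'ö' => 'oe', 'ä' => 'ae', 'ß' => 'ss'],
    'tr' => ['ü' => 'u', 'ö' => 'o', 'ç' => 'c', 'ş' => 's', 'ı' => 'i', 'ğ' => 'g'],
];

// Hypothetical helper: transliterate $text, preferring the map for $language.
function sketch_downcode(string $text, string $language, array $maps): string
{
    // Start with the preferred language's map, if there is one.
    $merged = isset($maps[$language]) ? $maps[$language] : [];

    // The array union operator (+) keeps the left-hand value for duplicate
    // keys, so entries from the preferred map are never overwritten.
    foreach ($maps as $map) {
        $merged += $map;
    }

    return strtr($text, $merged);
}

echo sketch_downcode('müjde', 'tr', $maps), "\n";  // mujde
echo sketch_downcode('üblich', 'de', $maps), "\n"; // ueblich
```

With an unknown language code, this sketch simply falls back to whichever map appears first in `$maps`.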

Comments
  • Not compatible with Laravel 9

    Laravel 9 requires voku/portable-ascii:^2.0, whereas this repo requires voku/portable-ascii:^1.4, which causes a conflict when trying to update composer.

    opened by emedchill 7
  • Use classmap instead.

    The verdict is out: our cool feature was just too cool =)

    They suggest we use classmap instead.

    Included the URLify.php as a classmap for autoloader instead of psr-0.

    opened by nickl- 6
  • Fix URLify::init() when called with some language

    If $language is not one for which there's a key in the $maps array, $chars is not reset, and the regular expression grows longer every time init() is called.
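The failure mode described here can be reproduced in isolation. The sketch below uses a hypothetical `SketchMap` class, not URLify's actual code, to contrast an init that only resets its character list for a known language with one that always resets it.

```php
<?php

// Hypothetical stand-in for a class that builds a character list from maps.
final class SketchMap
{
    /** @var array<string, array<string, string>> */
    private static $maps = ['de' => ['ü' => 'ue'], 'tr' => ['ü' => 'u']];

    /** @var string Characters that would later be placed in a regex class. */
    public static $chars = '';

    // Buggy variant: $chars is only cleared when the language is known.
    public static function initBuggy(string $language = ''): void
    {
        if (isset(self::$maps[$language])) {
            self::$chars = '';
        }
        foreach (self::$maps as $map) {
            self::$chars .= implode('', array_keys($map));
        }
    }

    // Fixed variant: always start from an empty list.
    public static function initFixed(string $language = ''): void
    {
        self::$chars = '';
        foreach (self::$maps as $map) {
            self::$chars .= implode('', array_keys($map));
        }
    }
}

// 'xx' is not a key in the maps, so the buggy variant keeps appending.
SketchMap::initBuggy('xx');
SketchMap::initBuggy('xx');
$buggyLength = strlen(SketchMap::$chars);

SketchMap::$chars = '';
SketchMap::initFixed('xx');
SketchMap::initFixed('xx');
$fixedLength = strlen(SketchMap::$chars);

echo $buggyLength > $fixedLength ? "buggy list grew\n" : "no growth\n";
```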

    opened by mlocati 4
  • PSR-0 compliance and other goodies

    I was rather sceptical at first, not wanting to overcomplicate this simple class with namespaces and 30-levels-deep library/src/package folders, but I just bit my lip and tried:

            "autoload": {
                    "psr-0": {
                            "": ""
                    }
            }
    

    and it worked =) so "" is PSR-0 capable and we have all the nyummy goodness of autoloading.

    • Added an INSTALL file to explain installation.
    • Added bootstrap.php to strap the vendor/autoloader.
    • Added phpunit.xml.
    • Removed the require_once from the test.

    In the test folder, just run phpunit with no arguments. If bootstrap can't strap, it will display the INSTALL file; otherwise the tests will run as if no one cares.

    Enjoy!

    opened by nickl- 4
  • Fix Issue #55

    This change fixes a language exception that occurs when a character that is used as a regular expression delimiter ("/" by default) is included as a key in the array argument passed into the add_chars() method, and then downcode() is called.

    Note that the existing tests do not cover this scenario.

    opened by cbj4074 2
  • Missing A char

    Hi, I found a strange bug. Look at the code below (local ENV: PHP 5.6 on macOS, dev-prod ENV: PHP 5.6 on Ubuntu 16):

    • var_dump(\URLify::filter('Text sample A')); // text-sample
    • var_dump(\URLify::filter('Text sample B')); // text-sample-b
    • var_dump(\URLify::filter('Text sample AA')); // text-sample-aa

    Where is, in the first var_dump, the last "a" char?

    Is this package still maintained?

    opened by marlenesco 2
  • Make usage of remove_list optional

    Hi,

    currently, the removal of words can only be influenced by setting the public static $remove_list property (as seen in #35).

    When using URLify in multiple places in a project, this has to be done multiple times, which seems error-prone. Also, using the remove_list feature in some calls to URLify::filter while disabling it in other calls isn't currently possible.

    This pull request adds an additional parameter to the URLify::filter() method to toggle the usage of the remove list feature.

    Thx! :)
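The effect of such a toggle can be sketched without the library. Everything below is hypothetical: `sketch_slug()`, its `$useRemoveList` parameter, and the word list are illustrative names, and the list is assumed to contain "a", which would also explain why a trailing "A" disappears from slugs such as "Text sample A".

```php
<?php

// Hypothetical stop-word list (assumption: it includes the article "a").
const SKETCH_REMOVE_LIST = ['a', 'an', 'the', 'of'];

// Hypothetical slug helper with a per-call switch for stop-word removal.
function sketch_slug(string $text, bool $useRemoveList = true): string
{
    // Lower-case the text and split it on anything that is not a-z or 0-9.
    $words = preg_split('/[^a-z0-9]+/', strtolower($text), -1, PREG_SPLIT_NO_EMPTY);

    if ($useRemoveList) {
        // Drop every word that appears in the remove list.
        $words = array_values(array_diff($words, SKETCH_REMOVE_LIST));
    }

    return implode('-', $words);
}

echo sketch_slug('Text sample A'), "\n";        // text-sample
echo sketch_slug('Text sample A', false), "\n"; // text-sample-a
echo sketch_slug('Text sample AA'), "\n";       // text-sample-aa
```

Passing the switch per call avoids mutating shared static state, which is the error-prone part of the property-based approach.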

    opened by mkraemer 2
  • Unable to urlify properly

    Hi there,

    I've been trying to urlify a very simple string, but the last part is being dropped. This is probably intended behaviour, but an option to avoid it would be useful.

    My string is "Brazilian Série A" and I want it to become "brazilian-serie-a". It becomes "brazilian-serie" instead without the final "-a" part. Any way I can do this?

    Below my code:

    \URLify::filter('Brazilian Série A') // produces "brazilian-serie"
    

    Tried also with:

    \URLify::filter('Brazilian Série A', 120, 'en') // produces "brazilian-serie"
    
    opened by fracasula 2
  • Added the possibility to prioritize urlify language maps

    This is useful if languages have different rules for the same character. For example, German maps "ü" to "ue", whereas Turkish maps "ü" to "u".

    opened by patrickheck 2
  • [+]: use "voku/portable-ascii"

    reference: https://github.com/jbroadway/urlify/issues/51

    Here I added "voku/portable-ascii", which is used, for example, in the "dev-master" version of Laravel and is itself based on this project.


    I know it's not that easy to pick a minimal PHP version to support, but most systems have already upgraded to > 7.0 (https://blog.packagist.com/php-versions-stats-2019-1-edition/), so maybe it's time to move forward?

    | Linux Distro | Version        | End of Life       | Default PHP |
    |--------------|----------------|-------------------|-------------|
    | Ubuntu       | 14.04 (Trusty) | April 2019 (EOL)  | 5.5.9       |
    | Ubuntu       | 16.04 (Xenial) | April 2024        | 7.0         |
    | Ubuntu       | 18.04 (Bionic) | April 2028        | 7.2         |
    | Debian       | 8 (Jessie)     | June 30, 2020     | 5.6.29      |
    | Debian       | 9 (Stretch)    | ~2022             | 7.0         |
    | Fedora       | 29             | October 30, 2018  | 7.2.10      |
    | Fedora       | 30             | April 30, 2019    | 7.3.4       |
    | OpenSUSE     | Leap 15.1      | November 22, 2020 | 7.2.5       |
    | CentOS       | 6              | November 30, 2020 | 5.3.3       |
    | CentOS       | 7              | June 30, 2024     | 5.4.16      |
    | RHEL         | 6              | November 30, 2020 | 5.3.3       |
    | RHEL         | 7              | June 30, 2024     | 5.4.16      |
    | RHEL         | 8              | May 2029          | 7.2         |
    | OEL          | 6              | March 2021        | 7.0 (min)   |
    | OEL          | 7              | July 2024         | 7.0 (min)   |


    PS: in my fork I also added different "stop-words" (https://github.com/voku/stop-words/tree/master/src/voku/helper/stopwords) for different languages and some other specials like support for currencies (https://github.com/voku/urlify/blob/master/src/voku/helper/URLify.php#L484). I don't know if this is also interesting for you?



    opened by voku 1
  • Passing certain characters to add_chars() method causes "preg_match_all(): Unknown modifier ']'"

    Consider the following:

    URLify::add_chars(['/' => '']);
    

    This causes a language exception, preg_match_all(): Unknown modifier ']', because the / character is used as the regular expression delimiter within the URLify library.

    The above example derives from a fairly common and reasonable use-case: I want to remove all illegal characters from a file name, and on UNIX and Windows, / is illegal.

    To fix this, PHP's preg_quote() function must be called on the keys in the array argument passed to add_chars().

    I'll submit a PR shortly that seeks to fix the issue.
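The preg_quote() approach described above can be sketched in isolation. The `$chars` map and the surrounding pattern are hypothetical and not taken from URLify's source. One detail that matters: preg_quote() escapes the delimiter only when it is passed as the second argument.

```php
<?php

// Hypothetical character map containing the regex delimiter "/".
$chars = ['/' => '', '®' => '(r)'];

// Escape each key, passing "/" as the delimiter so that it is quoted too.
$quoted = array_map(
    function (string $key): string {
        return preg_quote($key, '/');
    },
    array_keys($chars)
);

// Build a character class from the escaped keys; the "u" modifier makes
// PCRE treat "®" as one UTF-8 character rather than two bytes.
$pattern = '/[' . implode('', $quoted) . ']/u';

$count = preg_match_all($pattern, 'a/b ® c', $matches);
echo $count, "\n"; // 2
```

Without the escaping step, the pattern would read `/[/®]/u`, where the second `/` ends the expression early and the remaining `]` is parsed as a modifier, which matches the reported error message.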

    opened by cbj4074 1
  • 1.2.4 changed transliteration behaviour

    Upgrading from 1.2.3 to 1.2.4 broke our test suite, in particular some characters are transliterated differently, breaking assertions and semver.

    E.g. we test that това е текст на бълрагски за тест becomes tova-e-tekst-na-blragski-za-test which is true in 1.2.3 and false in 1.2.4.

    In 1.2.4 it instead transliterates to tova-e-tekst-na-bielragski-za-test.

    | urlify version | in        | out        |
    |----------------|-----------|------------|
    | 1.2.3          | бълрагски | blragski   |
    | 1.2.4          | бълрагски | bielragski |

    I'm sure the dependency has its reasons for doing this, but composer pulled in 1.2.4 automatically and broke our test suites. This should have been a 1.3.0 or a 2.0.0 release.

    opened by tomjn 2
  • Why is $underscoreToSpace removed?

    Hi,

    Why is $underscoreToSpace removed from the filter? It was pretty handy for turning underscores into hyphens if you wanted, or spaces of course.

    I hope there is a good reason for it!

    Thanks

    opened by Yamakasi 4
  • Add new param $trim_under_score

    It will fix the issue when trimming a text:

    From : -test- to test

    With this option the "-" won't be removed.

    By default it works as usual.

    opened by jmontoyaa 0
  • Support more characters by default

    Had to add the following chars for our transliteration test to pass:

            URLify::add_chars(
                array(
                    'Ÿ' => 'Y',
                    'µ' => 'u',
                    '¥' => 'Y',
                    'Ĉ' => 'C',
                    'ĉ' => 'c',
                    'Ċ' => 'C',
                    'ċ' => 'c',
                    'Ĝ' => 'G',
                    'ĝ' => 'g',
                    'Ġ' => 'G',
                    'ġ' => 'g',
                    'Ĥ' => 'H',
                    'ĥ' => 'h',
                    'Ħ' => 'H',
                    'ħ' => 'h',
                    'Ĕ' => 'E',
                    'ĕ' => 'e',
                    'Ĭ' => 'I',
                    'ĭ' => 'i',
                    'Ĵ' => 'J',
                    'ĵ' => 'j',
                    'Ĺ' => 'L',
                    'ĺ' => 'l',
                    'Ľ' => 'L',
                    'ľ' => 'l',
                    'Ŀ' => 'L',
                    'ŀ' => 'l',
                    'ʼn' => 'n',
                    'Ō' => 'O',
                    'ō' => 'o',
                    'Ŏ' => 'O',
                    'ŏ' => 'o',
                    'Ŕ' => 'R',
                    'ŕ' => 'r',
                    'Ŗ' => 'R',
                    'ŗ' => 'r',
                    'Ŝ' => 'S',
                    'ŝ' => 's',
                    'Ŧ' => 'T',
                    'ŧ' => 't',
                    'Ŭ' => 'U',
                    'ŭ' => 'u',
                    'Ŵ' => 'W',
                    'ŵ' => 'w',
                    'Ŷ' => 'Y',
                    'ŷ' => 'y',
                    'ſ' => 'i',
                    'ƒ' => 'f',
                    'O' => 'O',
                    'o' => 'o',
                    'U' => 'U',
                    'u' => 'u',
                    'Ǎ' => 'A',
                    'ǎ' => 'a',
                    'Ǐ' => 'I',
                    'ǐ' => 'i',
                    'Ǒ' => 'O',
                    'ǒ' => 'o',
                    'Ǔ' => 'U',
                    'ǔ' => 'u',
                    'Ǖ' => 'U',
                    'ǖ' => 'u',
                    'Ǘ' => 'U',
                    'ǘ' => 'u',
                    'Ǚ' => 'U',
                    'ǚ' => 'u',
                    'Ǜ' => 'U',
                    'ǜ' => 'u',
                    'Ǻ' => 'A',
                    'ǻ' => 'a',
                    'Ǿ' => 'O',
                    'ǿ' => 'o',
                    'Ǽ' => 'Ae',
                    'ǽ' => 'ae',
                    'IJ' => 'IJ',
                    'ij' => 'ij',
                    'J' => 'J',
                    'ĸ' => 'k',
                    'Ŋ' => 'N',
                    'ŋ' => 'n',
                    'Ẁ' => 'W',
                    'ẁ' => 'w',
                    'Ẃ' => 'W',
                    'ẃ' => 'w',
                    'Ẅ' => 'W',
                    'ẅ' => 'w',
                )
            );
    

    Unfortunately, since I do not know what language they belong to, I find it difficult to provide a PR when the code is structured based on language.

    opened by motin 1
Releases(1.2.4-stable)
  • 1.2.4-stable(Jun 15, 2022)

  • 1.2.3-stable(Jan 18, 2022)

    • Migrated CI from travis-ci to GitHub Actions
    • Updated test fixtures
    • Updated composer description and dependency versions
    • Added badges to readme
  • 1.2.2-stable(Jun 14, 2020)

  • 1.2.1-stable(Jun 2, 2020)

    • Fixed tests broken from changes in voku/portable-ascii
    • Strip additional dev files from releases - thanks @Tobion!
    • Now requires PHP 7.2+ to match PHPUnit
    • Fixed missing autoloader include in command line scripts
    • Readme updates
  • 1.2.0-stable(Dec 13, 2019)

    • Using voku/portable-ascii performance optimized ascii string function library
    • Stop word support for multiple languages (disabled by default)
    • Currency symbol support
    • Support for more unicode characters
    • Removed support for PHP versions before 7.0

    Thanks to @voku for the improvements!

  • 1.1.3-stable(Jun 27, 2019)

    • Fixed issue with / character being added via add_chars()
    • Fixed Vietnamese language code
    • Fixed potential duplicate word issue
    • Removed HHVM from testing and added newer PHP versions to automated tests

    Thanks to @pincombe, @scorp13, and @cbj4074 for the fixes!

  • 1.1.2-stable(Dec 8, 2018)

  • 1.1.1-stable(Aug 28, 2018)

  • 1.1.0-stable(Jan 3, 2017)

  • 1.0.9-stable(Sep 14, 2016)

  • 1.0.8-stable(Jul 27, 2016)

    This release adds two new options to filter():

    1. $lower_case specifies whether you want to convert to lower case (the default), or preserve the existing case of the text.
    2. $treat_underscore_as_space specifies whether you want to convert underscores to spaces (the default), or preserve underscores in the output.

    Thanks @ywarnier and @jmontoyaa for these additions!

  • 1.0.7-stable(Dec 7, 2015)

  • 1.0.6-stable(Oct 15, 2015)

    Bulgarian characters added, new CLI scripts (downcode, filter, transliterate), fix for UTF-8 spaces. Thanks to @skyosev, @shefi, @rinogo, and @karptonite for these!

  • 1.0.5-stable(May 29, 2015)

  • 1.0.4-stable(Mar 9, 2015)

  • 1.0.3-stable(Mar 17, 2014)

  • 1.0.2-stable(Feb 5, 2014)

  • 1.0.1-stable(Oct 16, 2013)

Owner
Aband*nthecar
Full-stack developer. One-man synthpop band. CTO/Co-Founder @ HeyAlfa + Flipside XR.