  • Amir Khan 860 posts 1509 karma points
    Mar 31, 2010 @ 01:05

    Can't insert em dashes

    I can't insert em dashes in the TinyMCE editor. The character appears as an em dash until I publish, then it is converted to a hyphen. Any ideas?


    Thank you!

  • mfeola 117 posts 221 karma points
    Jun 17, 2010 @ 20:05

    I have the same issue. I saw this post that might be helpful, but I don't know if I want to turn off the tidy code option, because I can see the admin trying to copy and paste code from a Word document or something.

  • Alexander Bryukhov 19 posts 68 karma points c-trib
    Jul 18, 2010 @ 00:54

    Hi all!

    When I noticed the same behaviour in Umbraco 4.0.x, I started a little investigation and internet crawling... The same problem was reported in the old Umbraco forum in 2009, and it seems there is still no solution except turning Tidy off...

    So it was time for me to download the Tidy.NET sources from SourceForge and start debugging ;-)

    The results are:

    1) The replacement of dashes (and some quotation marks) happens inside Tidy's PPrint (pretty-print) part. The code snippet is:

       if (_options.MakeClean) {
        if (c >= 0x2013 && c <= 0x201E) {
         switch (c) {
         case 0x2013: // en dash
         case 0x2014: // em dash
          c = '-';
          break;
         case 0x2018:
         case 0x2019:
         case 0x201A:
          c = '\'';
          break;
         case 0x201C:
         case 0x201D:
         case 0x201E:
          c = '"';
          break;
         }
        }
       }

    2) There are two possible solutions, as far as I can see:

      a) Turn off the MakeClean option in the Umbraco XHTML helper code (in the Tidy initialization part). This requires recompiling the Umbraco CMS DLL.

      b) Because MakeClean is used in other parts of Tidy and turning it off might have unexpected results, I made a small change in Tidy's own code instead, changing

    if (_options.MakeClean) {

    into

    if (_options.MakeClean && _options.CharEncoding != CharEncoding.Raw && _options.CharEncoding != CharEncoding.UTF8) {

    3) With this small update, Tidy will make no dash changes to your editor content if the TidyCharEncoding option in "config\umbracoSettings.config" is left at its default ("UTF8") or set to "Raw".

    4) There is another sad point connected to this behaviour: the TinyMCE paste plugin and its default replacement behaviour. When you insert content containing UTF-8 entities via Ctrl-V in the paste plugin, TinyMCE replaces a rather strange ;-) set of them with their ASCII equivalents. Luckily, this behaviour can be changed very simply by adding the following two keys to the "config\tinyMceConfig.config" file:

        <config key="entity_encoding">raw</config>
        <config key="paste_replace_list">,</config>
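
    For anyone who wants to see the effect of points 1) to 3) without building Tidy.NET, here is a minimal Python sketch of the substitution and of the extra encoding check. It is not Tidy.NET code: the names `clean_char`, `clean_text`, `make_clean`, `patched` and the encoding strings are illustrative assumptions that only mirror the snippets above.

```python
# Illustrative sketch only; it mirrors the C# snippets above, not the real Tidy.NET API.

# Code points that the MakeClean branch rewrites to ASCII.
ASCII_FALLBACK = {
    0x2013: "-",   # en dash
    0x2014: "-",   # em dash
    0x2018: "'",   # left single quotation mark
    0x2019: "'",   # right single quotation mark
    0x201A: "'",   # single low-9 quotation mark
    0x201C: '"',   # left double quotation mark
    0x201D: '"',   # right double quotation mark
    0x201E: '"',   # double low-9 quotation mark
}

# Encodings that can carry these characters unchanged (names are assumptions).
SAFE_ENCODINGS = {"Raw", "UTF8"}


def clean_char(ch, make_clean=True, encoding="UTF8", patched=True):
    """Return the character the pretty-printer would emit for a single character ch."""
    if not make_clean:
        return ch
    # The patched condition skips the substitution for Raw and UTF8 output.
    if patched and encoding in SAFE_ENCODINGS:
        return ch
    return ASCII_FALLBACK.get(ord(ch), ch)


def clean_text(text, **kwargs):
    """Apply clean_char to every character of text."""
    return "".join(clean_char(ch, **kwargs) for ch in text)


if __name__ == "__main__":
    sample = "Wait \u2014 \u201cquoted\u201d"
    print(clean_text(sample, patched=False))  # unpatched behaviour
    print(clean_text(sample, patched=True))   # with the extra encoding check
```

    With `patched=False` the dash and the curly quotes come back as ASCII; with the extra check and a UTF8 or Raw encoding the text is returned unchanged.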


    --> To all gurus: what is the best way to share the updated TidyNet.dll file somewhere on this community site?


  • Heather Floyd 435 posts 512 karma points c-trib
    Oct 09, 2010 @ 00:48


    You could make it a "package" and post it under "Projects", or you can just upload the dll to your own server and paste a download link here.


  • Doug Mackay 56 posts 155 karma points
    Jan 23, 2011 @ 17:32

    If anyone else comes across this problem, I've created a package with a new TidyNet.dll based on Alexander's post above.

    Package is here

  • Alexander Bryukhov 19 posts 68 karma points c-trib
    Jan 23, 2011 @ 18:24

    Thanks, Doug!

  • Stephen 717 posts 2039 karma points hq
    Feb 16, 2012 @ 22:13

    @Doug - just discovered that one today. Saved my day!

  • Allan James 20 posts 40 karma points
    Feb 24, 2012 @ 19:36

    Thanks Doug.... this was exactly what I needed.

    Who made the decision to replace em-dash with '-'? What would possibly be the reasoning behind this?


  • Funka! 398 posts 659 karma points
    Apr 28, 2012 @ 04:30

    I've also noticed that non-breaking spaces (&nbsp;) get stripped and replaced with regular spaces. This has unfortunate consequences in those rare (but important!) cases where these are truly needed... (One of the more common examples is in empty table cells when dealing with really old sites.)

    I'm definitely going to try Doug's package because the em-dashes have also been a major pain for us: they are an important typographical/grammatical character which is definitely not the same as a hyphen! I just found this post now, so thought I'd also mention the non-breaking space problem for posterity...



  • Funka! 398 posts 659 karma points
    May 07, 2012 @ 19:53

    Out of curiosity, I downloaded the Tidy.NET source from SourceForge and found the same spot in the code that Alexander noted, along with some interesting commentary. (Still trying to see where perhaps our precious non-breaking spaces are being stripped.)

    To respond to Allan's question,

    Who made the decision to replace em-dash with '-'? What would possibly be the reasoning behind this?

    .... Here's what it says about this. (Keep in mind, the Tidy.NET project was last updated in June of 2005! And this code section in particular seems to have been last updated back in 2000??? So perhaps a bit of antiquity that is doing more harm than good now...)

    Filters from Word and PowerPoint often use smart
    quotes resulting in character codes between 128
    and 159. Unfortunately, the corresponding HTML 4.0
    entities for these are not widely supported. The
    following converts dashes and quotation marks to
    the nearest ASCII equivalent. My thanks to
    Andrzej Novosiolov for his help with this code.
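
    The comment talks about character codes 128 to 159, while the code tests 0x2013 to 0x201E. These are the same characters seen from two sides: in the Windows-1252 code page, the dashes and smart quotes sit in the 128 to 159 range, and they decode to exactly those Unicode code points. A small Python check (just an illustration, not part of Tidy.NET; the helper name `cp1252_to_code_point` is made up) shows the mapping:

```python
# Windows-1252 bytes in the 128-159 range that hold the dashes and smart quotes.
CP1252_BYTES = [0x82, 0x84, 0x91, 0x92, 0x93, 0x94, 0x96, 0x97]


def cp1252_to_code_point(byte_value):
    """Decode one Windows-1252 byte and return its Unicode code point."""
    return ord(bytes([byte_value]).decode("cp1252"))


if __name__ == "__main__":
    for b in CP1252_BYTES:
        print(f"byte {b} (0x{b:02X}) -> U+{cp1252_to_code_point(b):04X}")
```

    Bytes 0x96 and 0x97, for example, decode to U+2013 (en dash) and U+2014 (em dash), the first two cases of the switch statement.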

    It has been an interesting learning experience to look into the Tidy.NET project a bit and realize the magnitude of all the tidying going on behind the scenes!
