• orcrist@lemm.ee
    3 months ago

    I’m not a professional code monkey although I’ve done a fair amount of coding, and every time I tried to do parsing myself, I later regretted it.

    But telling people that they’re doing it wrong is rarely met with positivity. :-)

  • schnurrito@discuss.tchncs.de
    3 months ago

    no, this is one of the worst answers on Stack Overflow

    OP had a specific question to capture opening tags. The thing OP asked about can be done with regular expressions. It is true that arbitrarily nested languages like HTML cannot generally be parsed with regular expressions, but that is not what OP asked about.
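For illustration, a flat scan for opening tags might look like the sketch below. The pattern, the `opening_tag_names` helper, and the sample string are all assumptions made up for this example, not something taken from the original question; the sketch only holds for simple markup where no attribute value contains a `>`.

```python
import re

# Assumed pattern: "<", a tag name, then any run of characters that are
# not ">", then ">". Closing tags start with "</" and are therefore skipped.
OPENING_TAG = re.compile(r"<([A-Za-z][A-Za-z0-9]*)\b[^>]*>")


def opening_tag_names(html: str) -> list[str]:
    """Return the names of opening tags found by a flat regex scan."""
    return [match.group(1) for match in OPENING_TAG.finditer(html)]


sample = '<p>Hello <a href="foo">link</a></p>'
print(opening_tag_names(sample))  # ['p', 'a']
```

This treats the document as a flat string and knows nothing about nesting, comments, or quoted attribute values.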

    • moriquende@lemmy.world
      3 months ago

      It can’t be done, as an opening tag in html can contain anything in its attributes, even JavaScript (e.g. onclick handler).

        • moriquende@lemmy.world
          3 months ago

          You can’t parse every HTML opening tag with regex, because an HTML opening tag doesn’t have a set structure. How would you match, with regex, this opening tag? <mytag myattribute="<value of \"myattribute\">" >

          • schnurrito@discuss.tchncs.de
            3 months ago

            Is this valid HTML? My understanding is that that attribute value needs to be escaped, i.e. &lt;value of \&quot;myattribute\&quot;&gt;.

            • moriquende@lemmy.world
              3 months ago

              The double quote doesn’t need to be escaped when the attribute value starts with a single quote, and neither does the rest. This is valid and tested: <img alt='my "<img>"'>
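To make the disagreement concrete, here is a rough sketch of how two regexes behave on that `<img>` example. Both patterns are assumptions written for this illustration: `NAIVE_TAG` stops at the first `>`, while `QUOTE_AWARE_TAG` skips over quoted attribute values. The second one copes with this particular tag, which is not a claim that it covers every HTML edge case.

```python
import re

# Naive pattern: everything from "<" up to the first ">".
NAIVE_TAG = re.compile(r"<[^>]*>")

# Quote-aware pattern (an assumption for this sketch): after the tag name,
# consume either a plain character, a double-quoted run, or a single-quoted run.
QUOTE_AWARE_TAG = re.compile(r"""<[A-Za-z][^\s/>]*(?:[^>"']|"[^"]*"|'[^']*')*>""")

tag = """<img alt='my "<img>"'>"""

# The naive pattern is cut short by the ">" inside the attribute value.
print(NAIVE_TAG.findall(tag))

# The quote-aware pattern returns the whole tag.
print(QUOTE_AWARE_TAG.findall(tag))
```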

    • fartsparkles@sh.itjust.works
      3 months ago

      This is StackOverflow after all. Your question is wrong. Your problem is wrong. You are wrong. I am right. Thread locked. Go read this other post that is totally unrelated to your problem I’ve decided isn’t the problem you’re facing because. I. Am. Right.

      • JackbyDev@programming.dev
        3 months ago

        I had a decade-old question marked as a duplicate and downvoted three times after years of no activity. SE is such a joke nowadays.

      • Quetzalcutlass@lemmy.world
        3 months ago

        Could be worse. At least it’s not Microsoft’s support forums:

        Hey, I see you’re having problems with <copy-paste key words from OP>. Try the following and see if it fixes your issue.

        Open a command prompt and enter "sfc /scannow".

        I hope this helps!

        (Reply marked as solution, thread closed.)

        • deadbeef79000@lemmy.nz
          3 months ago

          I have X years experience with {keyword salad}.

          Can you confirm {details already in the opening post}?

        • Skull giver@popplesburger.hilciferous.nl
          3 months ago

          The thing with Windows is that the three magical commands (sfc, that DISM tool, fixboot) will usually fix most weird OS problems. To the point where any Windows troubleshooting session should include either the results of the first two, or instructions to use them.

          Once SFC and DISM can’t fix your install, you reinstall Windows. There are alternatives, but if you knew them you wouldn’t be asking random Windows users on a forum. You can figure out a lot by enabling various tracing and logging features, listing open file handles and tracking file system calls, but the moment you need to take out sysmon you’re either in for a weekend of troubleshooting or wasting your time.

          Similarly, there are one-liners for Linux that’ll reinstall every package installed on the system, and those have helped me recover my broken systems several times.

          • captain_samuel_brady@lemm.ee
            3 months ago

            Magic may be an overstatement. I would be shocked if any of them fixed even 0.1% of the problems posted to Microsoft’s joke of a support forum where they were presented as solutions.

      • errer@lemmy.world
        3 months ago

        That’s why LLMs are so infuriatingly stubborn, they’re trained on these keyboard warriors

    • kbal@fedia.io
      3 months ago

      It can be done with simple regex of the kind proposed in various answers there iff the html is known to be limited to the subset of html where that sort of thing can easily be made to work. The question does not tell us whether or not that is the case, so everyone is free to make their own assumptions and argue as if they know what’s going on.

  • listless@lemmy.cringecollective.io
    3 months ago

    https://www.zalgo.org/

    T̸̛̟͚͋͛̈̊͜͝Ờ̶̤̫̦͙̜̫͇͕͈̘̭̈̑̓̀̈́̌͊͛̆͐̌̈́͝ͅN̸̯̫̺̄̿̎͗͗́͜Y̷̢̱͚̖̤̠̞͉̅́̋̉̿̇̎̋͆͝͝ ̸̧̡̨̧̡̛̖̤̜͔̲̯̞͉͈̻̎̈̄̓̊̄́̕͘͝͠ͅT̷͎̝͌̅̔̓̒H̷̨̧̧̳̱̜͓̮͍̣̬̩̜̙͚̑̌́̑͋̽͗̎͑̊͛̍́͒̕͝͠Ḙ̵̥̥̘̻͔͛̑͒̿͋͝͝ ̶̡͚̬͈̏͌̓̔̈̔̀͌̔̓̾̓͘͝P̷͙̃́̈͐̆̂́͗̏͌̈́Ô̶͎͓̹͖̘̟̬͚̻̦̩͔͛͜͠ͅŅ̶͖̜̱͍̦̔̊͐͆̾̎́́̈́̄̓ͅẎ̸̨̭̜̼͎̜̜͕̥͙̼̤̟̞̄̊̂́ͅ ̴̡̡̛̲̟̳̯͔̝̟͙̌̽͋̏̾̆̅̏̐̅͑̿̀͒̉H̵̪̞̩̥̫̺̅̑̈́̾͌͛́̾̅̈͛͒̾̌̈͐͝Ȅ̶̘̲͙̖̬̞͕̱͍̥͈̦͈͍͔̩̑̒̐̇̑̈́̏͊̽͜͝͝͝ ̸̨̛̛̻̘̙̯̰̦̻͈͓̒̽̉̈̄̌̄͊͂̈͆ͅC̵͙̗̣̮͈̜̪̞̰̣͎̙̏̌̄͗͜Ȯ̸͇̖̼͈̗̝͔̜̘̲̦̦̾̃̆̍͝͝ͅM̷̨̧̮͕̠̘̔ͅÉ̶̡̡̢̡͕̺̗̩̝̩͇͓̄͐͆͛̔̈́̕͜ͅS̵̡͙̬͔̞̞̳͓̜͔͑̌̓̎͆͌̈͌̌̂͛̚͘͝

    • SpaceNoodle@lemmy.world
      3 months ago

      d̶̢̢͖͉̪͖̠̹͇̺̜̼̦̗͍̓̅̊̋́̌̈́̌̐͐̔̅̄̚͜͝ͅͅȍ̷̢̧̦̼͉̝̦͓͖̽͜ǫ̷̫̤͐̀̾̈̇́̈́͛̐̔̀͜͜͝͝͠k̸̩̠̥̦̤̜͈͎̖̜̪̘͚̖̫̝̝͛̈́̇͒͜͝ͅì̷̧͈̥͇̤̝͈̹͕̽̑͌̐́̓̈́̈́ȩ̴̘̠͍͎̜̝̰̼̝̭̹͖͇͚̦͈̼̑̊͗́́̒͐̂̂̾̊̀͜͝

  • OpenStars@startrek.website
    3 months ago

    Calendar, remind me to ask StackOverflow tomorrow if I can parse HTML with regexs get someone else to do my class homework for me?

    TH̘Ë͖́̉ ͠P̯͍̭O̚​N̐Y̡ H̸̡̪̯ͨ͊̽̅̾̎Ȩ̬̩̾͛ͪ̈́̀́͘ ̶̧̨̱̹̭̯ͧ̾ͬC̷̙̲̝͖ͭ̏ͥͮ͟Oͮ͏̮̪̝͍M̲̖͊̒ͪͩͬ̚̚͜Ȇ̴̟̟͙̞ͩ͌͝S̨̥̫͎̭ͯ̿̔̀ͅ

    Sweet, my summoning spell worked!

  • solrize@lemmy.world
    3 months ago

    There is a famous Erik Naggum rant about XML at, no wait, I better not link it but you can find it with a search engine if you want, which means you don’t get to complain to me about it since you are the one who went looking for it. Very NSFW and VERY politically incorrect. Naggum died in 2009 but anyone who published a thing like that today would be raked over the coals.

  • fubo@lemmy.world
    3 months ago

    Once you learn about parser combinators, all other parsing looks pretty dopey.
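For anyone who hasn't met the idea: a parser combinator is a small function that builds a parser out of other parsers. The sketch below is a hand-rolled toy, not any particular library's API; every name in it (`literal`, `satisfy`, `many1`, `sequence`, `mapped`, `bare_opening_tag`) is hypothetical, and it only recognises attribute-free tags like `<div>`.

```python
from typing import Any, Callable, Optional, Tuple

# A parser takes (text, position) and returns (value, new_position),
# or None when it does not match at that position.
Parser = Callable[[str, int], Optional[Tuple[Any, int]]]


def literal(expected: str) -> Parser:
    """Match an exact string."""
    def parse(text, pos):
        if text.startswith(expected, pos):
            return expected, pos + len(expected)
        return None
    return parse


def satisfy(predicate) -> Parser:
    """Match one character for which predicate(char) is true."""
    def parse(text, pos):
        if pos < len(text) and predicate(text[pos]):
            return text[pos], pos + 1
        return None
    return parse


def many1(parser) -> Parser:
    """Apply a parser one or more times and collect the values."""
    def parse(text, pos):
        values = []
        result = parser(text, pos)
        while result is not None:
            value, pos = result
            values.append(value)
            result = parser(text, pos)
        return (values, pos) if values else None
    return parse


def sequence(*parsers) -> Parser:
    """Apply parsers one after another; fail if any of them fails."""
    def parse(text, pos):
        values = []
        for parser in parsers:
            result = parser(text, pos)
            if result is None:
                return None
            value, pos = result
            values.append(value)
        return values, pos
    return parse


def mapped(parser, func) -> Parser:
    """Transform the value produced by a successful parse."""
    def parse(text, pos):
        result = parser(text, pos)
        if result is None:
            return None
        value, new_pos = result
        return func(value), new_pos
    return parse


# Build a parser for "<name>" out of the pieces above.
tag_name = mapped(many1(satisfy(str.isalnum)), "".join)
bare_opening_tag = mapped(
    sequence(literal("<"), tag_name, literal(">")),
    lambda parts: parts[1],
)

print(bare_opening_tag("<div>", 0))   # ('div', 5)
print(bare_opening_tag("</div>", 0))  # None
```

The appeal is that each piece stays tiny and testable, and grammar rules compose the same way ordinary functions do.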

  • communism@lemmy.ml
    3 months ago

    OP isn’t trying to parse HTML though… they are trying to detect opening XML tags. Which seems quite achievable with regex.

    • winterayars@sh.itjust.works
      3 months ago

      It’s still actually pretty sketchy, depending on exactly what you want to do. Strict regex still won’t be able to match correctly if you want to match what an HTML parser considers the opening tag, though fancier regex will. If you’re just looking for the tags in the HTML document as a flat document it’s doable, though. (Mostly.)
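A small sketch of that difference, using Python's standard-library `html.parser.HTMLParser`: the regex, the `StartTagCollector` class, and the sample document are assumptions invented for this example. The flat scan reports a tag that sits inside an HTML comment, while the parser does not consider it an opening tag at all.

```python
import re
from html.parser import HTMLParser

# Assumed flat pattern: "<", a tag name, anything up to the next ">".
FLAT_OPENING_TAG = re.compile(r"<([A-Za-z][A-Za-z0-9]*)\b[^>]*>")


class StartTagCollector(HTMLParser):
    """Record the name of every opening tag the HTML parser reports."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)


document = "<!-- <b>old</b> --><p>text</p>"

# Flat scan: the commented-out <b> is matched like any other tag.
flat_tags = FLAT_OPENING_TAG.findall(document)

# Parser view: the comment is handled as a comment, so only <p> is reported.
collector = StartTagCollector()
collector.feed(document)
collector.close()
parser_tags = collector.tags

print(flat_tags)    # ['b', 'p']
print(parser_tags)  # ['p']
```

Which answer is "correct" depends on whether you want the tags in the text or the tags a parser would act on.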

  • HStone32@lemmy.world
    3 months ago

    SO in a nutshell:

    “I need to do X”

    “Have you tried Y?”

    “No, because I don’t need Y, I need X.”

    “Well you can do Z if you can’t do Y.”

    “OK, sure. But how do I do X?”

    “Why do you need to do X?”

    (Explains why in my hyper-specific situation, I need to do X, and Y and Z won’t work)

    This question has been marked as a duplicate of “How to do Y”

    • winterayars@sh.itjust.works
      3 months ago

      I would say it’s more like: “How can I do X?” “Here are some reasons you can’t do Y.”

      The answers should have been “Here are some reasons doing X is hard, but here’s an attempt at it anyway and also some more robust alternatives to doing X.” That would have been an excellent answer. (If you go down far enough you do start to see things like this but they’re hindered by people still responding that you can’t do Y or downvoting because they don’t understand what’s happening.)

    • interdimensionalmeme@lemmy.ml
      3 months ago

      Always start SO questions with X/Y problem pre-empting

      These people are everywhere and will stop at nothing to make you click on one of these

      https://xyproblem.info/ https://news.ycombinator.com/item?id=34444353 https://en.wikipedia.org/wiki/XY_problem

      They are trying to derail your question, which was already a generalized version of what your actual question was. And of course, you would need to explain everything you generalized out of your question (which would probably all get deleted by someone editing your question and removing all the irrelevant facts) by which point your question becomes so complicated nobody can answer it, even though they could have answered the generalized version.

      My advice: just use ChatGPT or Mistral. 99% of the time you will get a better answer than on StackOverflow. And you will get this actionable answer IMMEDIATELY!

    • JackbyDev@programming.dev
      3 months ago

      More like:

      • How can I do X?
      • Marked as duplicate of “How can I do Y?”

      Edit: I’ve got insomnia and don’t have my glasses on and misread the end.

    • purplemonkeymad@programming.dev
      3 months ago

      Except in 99% of cases the person is asking an xy problem, and if they ever explained the why, they would get a proper answer.

      Often the reason no one does the hyper-specific thing is that there are better non-code solutions, it’s massively insecure, or it’s just stupid micromanaging.

      • JackbyDev@programming.dev
        3 months ago

        and if they ever explained the why, they would get a proper answer.

        That’s funny, every time I’ve explained in detail why my question isn’t a duplicate nobody fucking cares and it still gets closed.

      • HStone32@lemmy.world
        3 months ago

        You know, when I typically ask a question on SO, it’s because I want to learn how that thing works, or how to write it myself. I usually say as much, but the SO folks are so focused on the ends that they completely neglect the means. Chances are I’m already aware of that no-code solution, but that’s not what I’m asking for.

        • orcrist@lemm.ee
          3 months ago

          I think there’s an element of responsibility that some people feel when they respond. If you’re asking for a very niche solution that is likely to create other problems in the future, should anyone else look at your code or refactor it or rely on it, or should you forget how it works, perhaps people are going to be less inclined to help you craft it.

          If you still want to craft it, that’s okay, but you have to expect that some real percent of the answers are going to be those folk who know what the tried and true solution is, often because they’ve lived through the reality that you’re attempting to create and they’ve dealt with the aftermath of doing it special and different.

  • fluckx@lemmy.world
    3 months ago

    So all the misery in the world is related to webdevs trying to parse html with regex?

    You bastards.

  • kbal@fedia.io
    3 months ago

    Using a regex on html is like eating wild mushrooms that you found in the woods. There are times where it’s appropriate and safe, other times where it’s completely insane and possibly deadly, and it takes considerable experience to know how to tell the difference.

  • Nariom@lemmy.world
    3 months ago

    I once applied for an internship at a company doing job-offer aggregation. During the interview they explained to me that the core of what they did was parsing (partial) HTML with regex. When I asked why they wouldn’t develop a custom parser, they replied that they were working on it, but that the internship wouldn’t focus on that. I was not disappointed when I didn’t get the job.