The recent addition of native image transformations is a very welcome feature, but it lacks documentation and has a couple of other problems:
 . The :crop parameter of image specs doesn't seem to be documented
   anywhere.  We previously supported it without documenting it only
   for ImageMagick, which was already a problem; now it's a much
   bigger problem.

 . The ELisp manual doesn't mention that :rotation is generally
   supported only for multiples of 90 deg, except if ImageMagick is
   available.

 . The transformation matrix used by the implementation is not
   described; one needs to guess what its components mean, or search
   the Internet for docs of XRender and/or Cairo, which are generally
   not helpful enough, especially the XRender one.

 . There are no tests, AFAICT.  We should have a simple manual test
   which could be used to exercise all of the transformations,
   separately and in combinations.  I wonder how the people who worked
   on the implementations for different platforms verified that the
   results are consistent, especially given the lack of documentation
   (e.g., is rotation of 90 deg clockwise or counter-clockwise?).

 . I question the wisdom of the current definition of
   image-transforms-p.  First, it returns a simple yes/no value, under
   the assumption that either all of the transformations are supported
   or none of them; IMO, it is better to return a list of supported
   capabilities instead.  Second, it doesn't distinguish between the
   capability to rotate by arbitrary angles and by 90-deg multiples
   only, which will require Lisp programs to test for ImageMagick,
   something that undermines the very raison d'être of this function.

Could someone please address these deficiencies ASAP?  Without that,
the feature seems incomplete and, frankly, not very clean.

TIA
On Tue, Jun 11, 2019 at 08:10:41AM +0300, Eli Zaretskii wrote:
> . The :crop parameter of image specs doesn't seem to be documented
>   anywhere.  We previously supported it without documenting it only
>   for ImageMagick, which was already a problem; now it's a much
>   bigger problem.
>
> . The ELisp manual doesn't mention that :rotation is generally
>   supported only for multiples of 90 deg, except if ImageMagick is
>   available.
>
> . The transformation matrix used by the implementation is not
>   described; one needs to guess what its components mean, or search
>   the Internet for docs of XRender and/or Cairo, which are generally
>   not helpful enough, especially the XRender one.

I hope the attached patch goes some way to solving these issues.

> . There are no tests, AFAICT.  We should have a simple manual test
>   which could be used to exercise all of the transformations,
>   separately and in combinations.  I wonder how the people who
>   worked on the implementations for different platforms verified
>   that the results are consistent, especially given the lack of
>   documentation (e.g., is rotation of 90 deg clockwise or
>   counter-clockwise?).

Is there some example of how to write a manual test like this?

> . I question the wisdom of the current definition of
>   image-transforms-p.  First, it returns a simple yes/no value,
>   under the assumption that either all of the transformations are
>   supported or none of them; IMO, it is better to return a list of
>   supported capabilities instead.  Second, it doesn't distinguish
>   between the capability to rotate by arbitrary angles and by
>   90-deg multiples only, which will require Lisp programs to test
>   for ImageMagick, something that undermines the very raison d'être
>   of this function.

My hope was that MS Windows support for affine transform matrices
would be forthcoming quite quickly, in which case there would be no
need to differentiate between the different types of transform (if one
is supported, they all are), but perhaps that’s hoping for too much. :)

I’ve been thinking a bit about the idea of returning some sort of
capabilities list and it seems quite neat.  We could perhaps roll some
of the imagemagick-types stuff into it.  I’ll look into it further.
--
Alan Third

0001-Document-image-transforms.patch
> Date: Tue, 11 Jun 2019 21:02:33 +0100
> From: Alan Third <[hidden email]>
> Cc: [hidden email]
>
> On Tue, Jun 11, 2019 at 08:10:41AM +0300, Eli Zaretskii wrote:
> > . The :crop parameter of image specs doesn't seem to be documented
> >   anywhere.  We previously supported it without documenting it only
> >   for ImageMagick, which was already a problem; now it's a much
> >   bigger problem.
> >
> > . The ELisp manual doesn't mention that :rotation is generally
> >   supported only for multiples of 90 deg, except if ImageMagick is
> >   available.
> >
> > . The transformation matrix used by the implementation is not
> >   described; one needs to guess what its components mean, or search
> >   the Internet for docs of XRender and/or Cairo, which are
> >   generally not helpful enough, especially the XRender one.
>
> I hope the attached patch goes some way to solving these issues.

It's good progress, thanks.  See a few comments below.

> > . There are no tests, AFAICT.  We should have a simple manual test
> >   which could be used to exercise all of the transformations,
> >   separately and in combinations.  I wonder how the people who
> >   worked on the implementations for different platforms verified
> >   that the results are consistent, especially given the lack of
> >   documentation (e.g., is rotation of 90 deg clockwise or
> >   counter-clockwise?).
>
> Is there some example of how to write a manual test like this?

See test/manual/redisplay-testsuite.el, for example.

The idea is to show some instructions, and then let the user/tester
invoke the operations being tested and observe the results.  The
expected results should be described as part of the instructions.

> > . I question the wisdom of the current definition of
> >   image-transforms-p.  First, it returns a simple yes/no value,
> >   under the assumption that either all of the transformations are
> >   supported or none of them; IMO, it is better to return a list of
> >   supported capabilities instead.  Second, it doesn't distinguish
> >   between the capability to rotate by arbitrary angles and by
> >   90-deg multiples only, which will require Lisp programs to test
> >   for ImageMagick, something that undermines the very raison d'être
> >   of this function.
>
> My hope was that MS Windows support for affine transform matrices
> would be forthcoming quite quickly, in which case there would be no
> need to differentiate between the different types of transform (if
> one is supported, they all are), but perhaps that’s hoping for too
> much. :)

Well, they don't really queue up for the job of making this work on
Windows, do they?

Part of the reason for my message was that I tried to figure out what
it would take to provide these capabilities on Windows, and quickly
got lost.  I have no background in graphics programming, neither on
Windows nor on any other platform, so for me good documentation is
critical.

Having this function be a simple boolean, on the assumption that all
GUI platforms provide all the sub-features, would actually mean that
this function is just a fancy alias for display-graphic-p.  I don't
think this is the best we can do.

In addition, my research indicates that the equivalent features on
Windows will have to use APIs that aren't available on Windows 9X
(unless we decide to transform on the pixel level with our own code,
which I think is way too gross).  So there will be cases where
rotations will not be supported, even after the code to do that has
been written.

Finally, there's the ImageMagick case, where we support rotations by
arbitrary angles, and it would be good to be able to make that
distinction with this single function, instead of making additional
tests.

> I’ve been thinking a bit about the idea of returning some sort of
> capabilities list and it seems quite neat.  We could perhaps roll
> some of the imagemagick-types stuff into it.

We have similar availability testing functions in gnutls.c.
> +@item :crop @var{geometry}
> +This should be a list of the form @code{(@var{width} @var{height}
> +@var{x} @var{y})}.  If @var{width} and @var{height} are positive
> +numbers they specify the width and height of the cropped image.  If
> +@var{x} is a positive number it specifies the offset of the cropped
> +area from the left of the original image, and if negative the offset
> +from the right.  If @var{y} is a positive number it specifies the
> +offset from the top of the original image, and if negative from the
> +bottom.  If @var{x} or @var{y} are @code{nil} or unspecified the crop
> +area will be centred on the original image.

This says "If WIDTH and HEIGHT are positive numbers, ...", but doesn't
say what happens if they are non-positive.

> +Cropping is performed after scaling but before rotation.

This sounds strange to me; are you sure?  I'd expect cropping to be
done either before everything else or after everything else.  Is this
so because that's how XRender does it?  At the very least, it begs the
question whether the parameters of :crop are measured in units before
or after scaling.

> @item :rotation @var{angle}
> -Specifies a rotation angle in degrees.
> +Specifies a rotation angle in degrees.  Only multiples of 90 degrees
> +are supported, unless the image type is @code{imagemagick}.  Positive
> +values rotate clockwise, negative values anti-clockwise.
                                            ^^^^^^^^^^^^^^
"counter-clockwise" is better, I think.

> +/* image_set_rotation, image_set_crop, image_set_size and
> +   image_set_transform use affine transformation matrices to perform
> +   various transforms on the image.  The matrix is a 2D array of
> +   doubles.  It is laid out like this:
> +
> +   m[0][0] = m11 | m[0][1] = m12 | m[0][2] = tx
> +   --------------+---------------+-------------
> +   m[1][0] = m21 | m[1][1] = m22 | m[1][2] = ty
> +   --------------+---------------+-------------
> +   m[2][0] = 0   | m[2][1] = 0   | m[2][2] = 1

Looking at the code, it seems that the matrix is actually defined like
this:

   m[0][0] = m11 | m[0][1] = m12 | m[0][2] = 0
   --------------+---------------+-------------
   m[1][0] = m21 | m[1][1] = m22 | m[1][2] = 0
   --------------+---------------+-------------
   m[2][0] = tx  | m[2][1] = ty  | m[2][2] = 1

If not, I think the Cairo code takes the wrong components for its
implementation.

> + tx and ty represent translations, m11 and m22 represent scaling
> + transforms and m21 and m12 represent shear transforms.

Can you please add the equations used to perform this affine
transformation, i.e. how x' and y' are computed from x and y?  I think
it will go a long way towards clarifying the processing.

Also, as long as I have your attention: could you please tell me what
you get in the elements of the matrix in image_set_size, just before
you pass the values to XRenderSetPictureTransform, when you evaluate
this in *scratch*:

  (insert-image (create-image "splash.svg" 'svg nil :rotation 90))

I stepped through the code trying to figure out how to map these
features to the equivalent Windows APIs, and I saw some results that
confused me.  After you show me the values you get in this use case, I
might have a few follow-up questions about the code, to clarify my
understanding of what the code does and what we expect from the
backend when we hand it the matrix computed here.

Thanks.
On Wed, Jun 12, 2019 at 06:30:11PM +0300, Eli Zaretskii wrote:
> > > . There are no tests, AFAICT.  We should have a simple manual
> > >   test which could be used to exercise all of the
> > >   transformations, separately and in combinations.  I wonder how
> > >   the people who worked on the implementations for different
> > >   platforms verified that the results are consistent, especially
> > >   given the lack of documentation (e.g., is rotation of 90 deg
> > >   clockwise or counter-clockwise?).
> >
> > Is there some example of how to write a manual test like this?
>
> See test/manual/redisplay-testsuite.el, for example.
>
> The idea is to show some instructions, and then let the user/tester
> invoke the operations being tested and observe the results.  The
> expected results should be described as part of the instructions.

That’s what I imagined, but I didn’t know about the existing manual
tests.  Good to know they’re there.  Thanks!

> > My hope was that MS Windows support for affine transform matrices
> > would be forthcoming quite quickly, in which case there would be
> > no need to differentiate between the different types of transform
> > (if one is supported, they all are), but perhaps that’s hoping for
> > too much. :)
>
> Well, they don't really queue up for the job of making this work on
> Windows, do they?

Indeed.

> Part of the reason for my message was that I tried to figure out
> what it would take to provide these capabilities on Windows, and
> quickly got lost.  I have no background in graphics programming,
> neither on Windows nor on any other platform, so for me good
> documentation is critical.
>
> Having this function be a simple boolean, on the assumption that all
> GUI platforms provide all the sub-features, would actually mean that
> this function is just a fancy alias for display-graphic-p.  I don't
> think this is the best we can do.
> In addition, my research indicates that the equivalent features on
> Windows will have to use APIs that aren't available on Windows 9X
> (unless we decide to transform on the pixel level with our own code,
> which I think is way too gross).  So there will be cases where
> rotations will not be supported, even after the code to do that has
> been written.
>
> Finally, there's the ImageMagick case, where we support rotations by
> arbitrary angles, and it would be good to be able to make that
> distinction with this single function, instead of making additional
> tests.

To be frank, the only reason I added this function is that having
XRender doesn’t guarantee every frame will be able to perform these
transforms.  I imagine that most modern hardware (from the last 15
years or so) will be able to handle it, but older hardware may be
using graphics cards that don’t support transforms.

I think this means you can end up in the situation where a frame on
monitor one, running on graphics card one, will work, but a frame on
monitor two on graphics card two won’t.  I think this is potentially
an even worse situation than on Windows, but probably very rare.

However, extending image-transforms-p, or coming up with another
implementation, makes a lot of sense.

> > I’ve been thinking a bit about the idea of returning some sort of
> > capabilities list and it seems quite neat.  We could perhaps roll
> > some of the imagemagick-types stuff into it.
>
> We have similar availability testing functions in gnutls.c.

Thanks, I’ll have a look.

> > +Cropping is performed after scaling but before rotation.
>
> This sounds strange to me; are you sure?  I'd expect cropping to be
> done either before everything else or after everything else.  Is
> this so because that's how XRender does it?  At the very least, it
> begs the question whether the parameters of :crop are measured in
> units before or after scaling.
I agree, but this is how our imagemagick code does it and I didn’t
want to make my code behave differently, even though I think it makes
no sense.

It’s easy enough to re‐order the events.  In the native transforms
code you simply reorder these function calls in lookup_image:

  image_set_size (img, transform_matrix);
  image_set_crop (img, transform_matrix);
  image_set_rotation (img, transform_matrix);

and reorder some of the code in imagemagick_load_image.

IMO the best order is probably crop, rotate and resize.  I think
there’s an argument for putting resize before rotate, but rotating by
90 degrees after setting the size would mean :max-width and
:max-height affecting the wrong dimensions, as they do at the moment.

Alternatively we could split resizing so :max-width and :max-height
always operate last.  Probably not worth it, though.  Just put
resizing last.

> > +/* image_set_rotation, image_set_crop, image_set_size and
> > +   image_set_transform use affine transformation matrices to
> > +   perform various transforms on the image.  The matrix is a 2D
> > +   array of doubles.  It is laid out like this:
> > +
> > +   m[0][0] = m11 | m[0][1] = m12 | m[0][2] = tx
> > +   --------------+---------------+-------------
> > +   m[1][0] = m21 | m[1][1] = m22 | m[1][2] = ty
> > +   --------------+---------------+-------------
> > +   m[2][0] = 0   | m[2][1] = 0   | m[2][2] = 1
>
> Looking at the code, it seems that the matrix is actually defined
> like this:
>
>    m[0][0] = m11 | m[0][1] = m12 | m[0][2] = 0
>    --------------+---------------+-------------
>    m[1][0] = m21 | m[1][1] = m22 | m[1][2] = 0
>    --------------+---------------+-------------
>    m[2][0] = tx  | m[2][1] = ty  | m[2][2] = 1
>
> If not, I think the Cairo code takes the wrong components for its
> implementation.

You’re right.

> > + tx and ty represent translations, m11 and m22 represent scaling
> > + transforms and m21 and m12 represent shear transforms.
>
> Can you please add the equations used to perform this affine
> transformation, i.e. how x' and y' are computed from x and y?  I
> think it will go a long way towards clarifying the processing.

I’ll add some further explanations of how to use the affine
transformation matrices, but I don’t know that I’ll be able to do a
very good job of explaining exactly how they work.  I would suggest
that anyone who is interested look it up elsewhere; however, I also
don’t think it’s necessary to fully understand the maths to be able to
use them.

I’ll provide an updated patch soon.

> Also, as long as I have your attention: could you please tell me
> what you get in the elements of the matrix in image_set_size, just
> before you pass the values to XRenderSetPictureTransform, when you
> evaluate this in *scratch*:
>
>   (insert-image (create-image "splash.svg" 'svg nil :rotation 90))
>
> I stepped through the code trying to figure out how to map these
> features to the equivalent Windows APIs, and I saw some results that
> confused me.  After you show me the values you get in this use case,
> I might have a few follow-up questions about the code, to clarify my
> understanding of what the code does and what we expect from the
> backend when we hand it the matrix computed here.

Using this code at the top of image_set_transform:

  fprintf (stderr, "matrix:\n");
  fprintf (stderr, "%f %f %f\n", matrix[0][0], matrix[1][0], matrix[2][0]);
  fprintf (stderr, "%f %f %f\n", matrix[0][1], matrix[1][1], matrix[2][1]);
  fprintf (stderr, "%f %f %f\n", matrix[0][2], matrix[1][2], matrix[2][2]);

I get this printed (on macOS, but it should be the same everywhere):

  matrix:
  0.000000 1.000000 0.000000
  -1.000000 0.000000 232.000000
  0.000000 0.000000 1.000000

I don’t know exactly what that means.  The 1 and -1 are shearing the
image in the x and y dimensions.  The 232 is moving the image in the y
dimension.  It’s actually moving the origin, so a positive value moves
the origin downwards.
The full process of rotating an image is to move the origin to the
centre of the image (width/2, height/2); perform the rotation, which
is some combination of shearing and resizing, but appears in this case
not to involve any actual resizing; and finally move the origin back
to the top left of the image, which may now be a different corner.

This is done by creating a matrix for each of those actions, then
multiplying the transformation matrix (the one we pass into each
function) by each of those matrices in order (matrix multiplication is
not commutative).

After we’ve done that we can use our modified transformation matrix to
transform points.  I believe we take the x and y coordinates, convert
them into a 3x1 matrix, and multiply that by the transformation
matrix, and it gives us a new set of coordinates:

  [x]   [m11 m12 m13]
  [y] X [m21 m22 m23] = [x’ y’ 0]
  [0]   [m31 m32 m33]

Luckily we don’t have to worry about the last step, as the graphics
toolkit will do it for us.

Cropping is easier, as we just move the origin to the top left of
where we want to crop and set the width and height accordingly.  The
matrices don’t know anything about width and height.

Scaling is also simple, as you can set m11 to scale x, m22 to scale y
and, I think, m33 to scale both by the same value.  My code ignores
m33.  And of course you have to set the width and height accordingly
again.

It’s possible to pre‐calculate the matrix multiplications and just
generate one transform matrix that will do everything we need in a
single step, but the maths for each element is much more complex and I
thought it was better to lay out the steps separately.

(Perhaps I should just put the above into the comment in image.c.)
--
Alan Third
On Wed, Jun 12, 2019 at 11:07:46PM +0100, Alan Third wrote:
> > > + tx and ty represent translations, m11 and m22 represent scaling
> > > + transforms and m21 and m12 represent shear transforms.

Having written a huge tract and messed around with outputting various
transform matrices, it seems clear to me now that the above is not
quite as straightforward as I was thinking.  A 90 degree rotation
followed by a resize doesn’t actually put anything in m11 and m22, so
clearly m11, m12, m21 and m22 interact in mysterious ways.

My feeble excuse is that I’m not a mathematician.
--
Alan Third
On Wed, Jun 12, 2019 at 6:09 PM Alan Third <[hidden email]> wrote:
>
> matrix:
> 0.000000 1.000000 0.000000
> -1.000000 0.000000 232.000000
> 0.000000 0.000000 1.000000
>
> I don’t know exactly what that means.  The 1 and -1 are shearing the
> image in the x and y dimensions.  The 232 is moving the image in the
> y dimension.

This is a transformation matrix using so-called homogeneous
coordinates:

https://en.wikipedia.org/wiki/Transformation_matrix#Affine_transformations

It's a clockwise 90 degree rotation followed by a translation along
the y axis.  In general you can't assign a geometric meaning to m11,
m12, m21, m22 taken individually; whether they represent rotation,
shearing, or scaling depends on their relative values.

> I believe we take the x and y coordinates, convert them into a 3x1
> matrix, and multiply that by the transformation matrix, and it gives
> us a new set of coordinates:
>
>   [x]   [m11 m12 m13]
>   [y] X [m21 m22 m23] = [x’ y’ 0]
>   [0]   [m31 m32 m33]

You need to use 1 instead of 0 when translating from Cartesian to
homogeneous coordinates.  That is, given a point (x, y), you find
(x', y') as follows.  Multiply (x, y, 1) by the transformation matrix.
Let the result be (a, b, c).  Then the new point (x', y') in Cartesian
coordinates is (a/c, b/c).

When dealing only with affine transformations the procedure is
simpler.  Such transformations can always be described by a matrix
where m31 == m32 == 0 and m33 == 1.  In that case, the result of
multiplication will have the form (a, b, 1), so x' == a and y' == b.
> From: Alp Aker <[hidden email]>
> Date: Thu, 13 Jun 2019 00:16:36 -0400
> Cc: Eli Zaretskii <[hidden email]>, Emacs devel <[hidden email]>
>
> > matrix:
> > 0.000000 1.000000 0.000000
> > -1.000000 0.000000 232.000000
> > 0.000000 0.000000 1.000000
> >
> > I don’t know exactly what that means.  The 1 and -1 are shearing
> > the image in the x and y dimensions.  The 232 is moving the image
> > in the y dimension.
>
> This is a transformation matrix using so-called homogeneous
> coordinates:
>
> https://en.wikipedia.org/wiki/Transformation_matrix#Affine_transformations

Right, I got that far; it's the details that somewhat confuse me, see
below.

> It's a clockwise 90 degree rotation followed by a translation along
> the y axis.

This already goes contrary to my geometric intuition, please bear with
me.  The rotation is around the (0,0) origin, i.e. around the top-left
corner of the original image, right?  If so, the rotation should have
been followed by a translation along the X axis, not Y, because
rotating a 333-pixel wide, 233-pixel high image 90 deg clockwise
produces a 233-pixel wide, 333-pixel high image that is entirely to
the LEFT of the Y axis.  Here's an ASCII-art representation of that:

  +------------------+> X          +----------+-------------------> X
  |                  |             |          |
  |                  |             |          |
  |                  |             |          |
  |                  |    ===>     |          |
  +------------------+             |          |
  |                                |          |
  |                                |          |
  |                                |          |
  |                                +----------+
  |                                |
  V                                V
  Y                                Y

The above is just after the rotation around (0,0).  Is that correct,
or am I missing something?

I also tried to approach this from the matrix notation aspect.  Are
the following equations correct for computing (x',y'), the new
coordinates of any pixel of the image, from its original coordinates
(x,y)?
  x' = m11 * x + m12 * y + tx
  y' = m21 * x + m22 * y + ty

where the factors are related to the matrix as follows:

  m[0][0] = m11 | m[0][1] = m12 | m[0][2] = 0
  --------------+---------------+-------------
  m[1][0] = m21 | m[1][1] = m22 | m[1][2] = 0
  --------------+---------------+-------------
  m[2][0] = tx  | m[2][1] = ty  | m[2][2] = 1

If the above is correct, then the transformation of the top-left
corner of the original image, whose original coordinates are (0,0),
yields

  x' = 0 * x + -1 * y + 0 = 0
  y' = 1 * x +  0 * y + 232 = 232

which is incorrect, since the correct coordinates should be (233,0),
not (0,232).  (The 232 vs 233 is some kind of round-off, but let's
ignore that for now.)  What am I missing here?

There's also the issue of cropping, which the current code in image.c
seems to represent as some kind of translation with a change in image
size.  But my, perhaps incorrect, interpretation of translation is
that the entire image is shifted along the X and Y axes, which is not
what cropping is about, AFAIU.  Again, I'm probably missing something
very fundamental here.

> In general you can't assign a geometric meaning to m11, m12, m21,
> m22 taken individually; whether they represent rotation, shearing,
> or scaling depends on their relative values.
>
> > I believe we take the x and y coordinates, convert them into a 3x1
> > matrix, and multiply that by the transformation matrix, and it
> > gives us a new set of coordinates:
> >
> >   [x]   [m11 m12 m13]
> >   [y] X [m21 m22 m23] = [x’ y’ 0]
> >   [0]   [m31 m32 m33]
>
> You need to use 1 instead of 0 when translating from Cartesian to
> homogeneous coordinates.  That is, given a point (x, y), you find
> (x', y') as follows.  Multiply (x, y, 1) by the transformation
> matrix.  Let the result be (a, b, c).  Then the new point (x', y')
> in Cartesian coordinates is (a/c, b/c).
>
> When dealing only with affine transformations the procedure is
> simpler.
> Such transformations can always be described by a matrix where
> m31 == m32 == 0 and m33 == 1.  In that case, the result of
> multiplication will have the form (a, b, 1), so x' == a and
> y' == b.

Sorry, now I'm even more confused: aren't we dealing with affine
transformations?  Then how are homogeneous coordinates related to
this?  And does that mean the formulae for calculating (x',y') I show
above are incorrect?

Thanks.
In reply to this post by Alan Third
> Date: Wed, 12 Jun 2019 23:07:46 +0100
> From: Alan Third <[hidden email]>
> Cc: [hidden email]
>
> > > +Cropping is performed after scaling but before rotation.
> >
> > This sounds strange to me; are you sure?  I'd expect cropping to
> > be done either before everything else or after everything else.
> > Is this so because that's how XRender does it?  At the very least,
> > it begs the question whether the parameters of :crop are measured
> > in units before or after scaling.
>
> I agree, but this is how our imagemagick code does it and I didn’t
> want to make my code behave differently, even though I think it
> makes no sense.

OK, but what about the question regarding the units of the :crop
parameters -- should they be interpreted as before or after the
scaling?

> > Can you please add the equations used to perform this affine
> > transformation, i.e. how x' and y' are computed from x and y?  I
> > think it will go a long way towards clarifying the processing.
>
> I’ll add some further explanations of how to use the affine
> transformation matrices, but I don’t know that I’ll be able to do a
> very good job of explaining exactly how they work.  I would suggest
> that anyone who is interested look it up elsewhere; however, I also
> don’t think it’s necessary to fully understand the maths to be able
> to use them.

I have shown my interpretation of the equations.  Trouble is, I cannot
find what XRender does anywhere.  Does someone know where to look for
that?

> I’ll provide an updated patch soon.

Thanks.

> > (insert-image (create-image "splash.svg" 'svg nil :rotation 90))
> >
> > I stepped through the code trying to figure out how to map these
> > features to the equivalent Windows APIs, and I saw some results
> > that confused me.  After you show me the values you get in this
> > use case, I might have a few follow-up questions about the code,
> > to clarify my understanding of what the code does and what we
> > expect from the backend when we hand it the matrix computed here.
> Using this code at the top of image_set_transform:
>
>   fprintf (stderr, "matrix:\n");
>   fprintf (stderr, "%f %f %f\n", matrix[0][0], matrix[1][0], matrix[2][0]);
>   fprintf (stderr, "%f %f %f\n", matrix[0][1], matrix[1][1], matrix[2][1]);
>   fprintf (stderr, "%f %f %f\n", matrix[0][2], matrix[1][2], matrix[2][2]);
>
> I get this printed (on macOS, but it should be the same everywhere):
>
>   matrix:
>   0.000000 1.000000 0.000000
>   -1.000000 0.000000 232.000000
>   0.000000 0.000000 1.000000

That's what I get, except that you've printed the matrix in
column-major order.  I described my conceptual problems with these
values in another message.

> Luckily we don’t have to worry about the last step, as the graphics
> toolkit will do it for us.

Unfortunately, I do have to worry about all of the steps, because I
need to figure out how to map all this to the equivalent Windows APIs.
Thus my questions, for which I apologize.  I'd prefer that someone
more knowledgeable about graphics programming did the changes on
Windows, but no one has stepped forward yet.

> (perhaps I should just put the above into the comment in image.c)

Yes, please.

Thanks.
On Thu, Jun 13, 2019 at 1:41 AM Eli Zaretskii <[hidden email]> wrote:
>
> This already goes contrary to my geometric intuition, please bear
> with me.  The rotation is around the (0,0) origin, i.e. around the
> top-left corner of the original image, right?  If so, the rotation
> should have been followed by a translation along the X axis, not Y

The last sentence is problematic.  When a transformation matrix
describes a rotation followed by a translation, the translation is
specified relative to the fixed coordinate axes.  The rotation doesn't
affect the direction of the translation.  It's perhaps unintuitive,
but that's the way the formalism is defined.  We could interpret
"rotate 90 then translate" in the way you describe, but then the
translation into algebra would be different.

> rotating a 333-pixel wide, 233-pixel high image 90 deg clockwise
> produces a 233-pixel wide, 333-pixel high image that is entirely to
> the LEFT of the Y axis.  Here's an ASCII-art representation of that:
>
>   +------------------+> X          +----------+-------------------> X
>   |                  |             |          |
>   |                  |             |          |
>   |                  |             |          |
>   |                  |    ===>     |          |
>   +------------------+             |          |
>   |                                |          |
>   |                                |          |
>   |                                |          |
>   |                                +----------+
>   |                                |
>   V                                V
>   Y                                Y
>
> The above is just after the rotation around (0,0).  Is that correct,
> or am I missing something?

That's correct.

> I also tried to approach this from the matrix notation aspect.  Are
> the following equations correct for computing (x',y'), the new
> coordinates of any pixel of the image, from its original coordinates
> (x,y)?
>
>   x' = m11 * x + m12 * y + tx
>   y' = m21 * x + m22 * y + ty
>
> where the factors are related to the matrix as follows:
>
>   m[0][0] = m11 | m[0][1] = m12 | m[0][2] = 0
>   --------------+---------------+-------------
>   m[1][0] = m21 | m[1][1] = m22 | m[1][2] = 0
>   --------------+---------------+-------------
>   m[2][0] = tx  | m[2][1] = ty  | m[2][2] = 1

I confess I'm not sure how to interpret that matrix.
I just looked through image_set_rotation and found it somewhat
confusing, as it seems to use column-major representation where I'd
expect row-major.  E.g., the above matrix looks odd to me, because tx
and ty would normally be in m[0][2] and m[1][2] (and I'd expect
m[2][0] == m[2][1] == 0).  Similarly, the rotation matrix used in
image_set_rotation:

  [0][0] = cos_r, [0][1] = -sin_r
  [1][0] = sin_r, [1][1] = cos_r

would normally describe a counter-clockwise rotation by r, not a
clockwise rotation.

That said, if I correctly understand the layout of the data, the
equations should be:

  x' = m11 * x + m21 * y + tx
  y' = m12 * x + m22 * y + ty

> If the above is correct, then the transformation of the top-left
> corner of the original image, whose original coordinates are (0,0),
> [...]
> the correct coordinates should be (233,0), not (0,232).
> What am I missing here?

The transformation described by the matrix is: rotate 90 degrees
around the origin, then translate by 232 along the y axis.  The first
operation leaves (0, 0) unmoved, then the second operation moves it to
(0, 232).

> Sorry, now I'm even more confused: aren't we dealing with affine
> transformations?  Then how are homogeneous coordinates related to
> this?  And does that mean the formulae for calculating (x',y') I
> show above are incorrect?

I was being unnecessarily general, which caused confusion; my
apologies.  Probably best for present purposes to ignore what I said
there.

(What I meant: Transformation matrices can be used for both affine and
non-affine transformations.  I first described the general case, then
described how the calculations work when we restrict to affine
transformations.  It's homogeneous coordinates in both cases, though.
I also wrote this last part before noticing the transposition issue I
mentioned above, which probably adds to the confusion.)
> From: Alp Aker <[hidden email]>
> Date: Thu, 13 Jun 2019 05:19:52 -0400
> Cc: Alan Third <[hidden email]>, Emacs devel <[hidden email]>
>
> > This already goes contrary to my geometric intuition, please bear with
> > me. The rotation is around the (0,0) origin, i.e. around the top-left
> > corner of the original image, right? If so, the rotation should have
> > been followed by a translation along the X axis, not Y
>
> The last sentence is problematic. When a transformation matrix describes a
> rotation followed by a translation, the translation is specified relative
> to the fixed coordinate axes. The rotation doesn't affect the
> direction of the translation.

That's right, but this is exactly what I was trying to describe. When
I wrote "translation along the X axis", I meant the original X axis,
which is unaffected by the rotation. Are you saying that my
expectations are incorrect in that interpretation of "X axis"?

> >   +------------------+> X        +----------+-------------------> X
> >   |                  |           |          |
> >   |                  |           |          |
> >   |                  |           |          |
> >   |                  |   ===>    |          |
> >   +------------------+           |          |
> >   |                              |          |
> >   |                              |          |
> >   |                              |          |
> >   |                              +----------+
> >   |                              |
> >   V                              V
> >   Y                              Y
> >
> > The above is just after the rotation around (0,0). Is that correct,
> > or am I missing something?
>
> That's correct.
>
> > I also tried to approach this from the matrix notation aspect. Is the
> > following the correct equations of computing (x',y'), the new
> > coordinates of any pixel of the image, from its original coordinates
> > (x,y)?
> >
> >   x' = m11 * x + m12 * y + tx
> >   y' = m21 * x + m22 * y + ty
> >
> > where the factors are related to the matrix as follows:
> >
> >   m[0][0] = m11 | m[0][1] = m12 | m[0][2] = 0
> >   --------------+---------------+-------------
> >   m[1][0] = m21 | m[1][1] = m22 | m[1][2] = 0
> >   --------------+---------------+-------------
> >   m[2][0] = tx  | m[2][1] = ty  | m[2][2] = 1
>
> I confess I'm not sure how to interpret that matrix.
> I just looked through image_set_rotation and found it somewhat
> confusing, as it seems to use column-major representation where I'd
> expect row-major. E.g., the above matrix looks odd to me, because tx
> and ty would normally be in m[0][2] and m[1][2] (and I'd expect
> m[2][0] == m[2][1] == 0). Similarly, the rotation matrix used in
> image_set_rotation:
>
>   [0][0] = cos_r,  [0][1] = -sin_r
>   [1][0] = sin_r,  [1][1] =  cos_r
>
> would normally describe a counter-clockwise rotation by r, not a
> clockwise rotation.

Maybe that's the problem: if the rotation is counter-clockwise, then
the translation should indeed be along the Y axis.

> That said, if I correctly understand the layout of the data, the
> equations should be:
>
>   x' = m11 * x + m21 * y + tx
>   y' = m12 * x + m22 * y + ty

AFAIU, this indeed describes a counter-clockwise rotation, not a
clockwise rotation.

> > the correct coordinates should be (233,0), not (0,232).
> > What am I missing here?
>
> The transformation described by the matrix is: rotate 90 degrees
> around the origin, then translate by 232 along the y axis. The
> first operation leaves (0, 0) unmoved, then the second operation
> moves it to (0, 232).

But if the rotation is clockwise, the result should be (233,0), right?

Thank you for helping me figure out this stuff.
On Thu, Jun 13, 2019 at 9:05 AM Eli Zaretskii <[hidden email]> wrote:
>
> That's right, but this is exactly what I was trying to describe. When
> I wrote "translation along the X axis", I meant the original X axis,
> which is unaffected by the rotation. Are you saying that my
> expectations are incorrect in that interpretation of "X axis"?
> [...]
> Maybe that's the problem: if the rotation is counter-clockwise, then
> the translation should indeed be along the Y axis.

The translation doesn't depend on the rotation in any way. The general
form of an affine transformation is F(x) + b, where F is a linear
transformation (rotation, shear, scaling) and addition by b is
translation. In the example we're discussing, F is rotation by 90
degrees and b is (0, 232) in some coordinates. F does not act on b;
the rotation does not affect what the translation does; whether F is
clockwise or counter-clockwise rotation, the function displaces the
result of the rotation along the (original) y axis.

(The single-matrix form used by the transformation code in image.c is
a computational convenience; the function it expresses still has the
form F(x) + b.)
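[Editorial note: the "rotate, then translate" point above can be checked numerically. The following is an illustrative Python sketch using 3x3 homogeneous matrices, not code from Emacs; the angle and the (0, 232) translation are the values discussed in this thread.]

```python
import math

def matmul(a, b):
    """Multiply two 3x3 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def apply(m, x, y):
    """Apply a 3x3 homogeneous matrix to the point (x, y)."""
    return (m[0][0] * x + m[0][1] * y + m[0][2],
            m[1][0] * x + m[1][1] * y + m[1][2])

# 90-degree rotation about the origin.  With the y axis pointing down
# (screen coordinates), this matrix rotates clockwise on screen, even
# though it is the textbook counter-clockwise matrix for y-up axes.
a = math.radians(90)
R = [[math.cos(a), -math.sin(a), 0],
     [math.sin(a),  math.cos(a), 0],
     [0, 0, 1]]

# Translation by (0, 232).
T = [[1, 0, 0],
     [0, 1, 232],
     [0, 0, 1]]

# "Rotate, then translate": the later operation multiplies on the left.
# The origin is fixed by R and then moved by T, whatever the angle is.
M = matmul(T, R)
x, y = apply(M, 0, 0)
print(round(x), round(y))  # -> 0 232
```

The translation column of M is exactly T's, independent of the rotation angle, which is the point being made: the rotation never acts on the translation vector.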
In reply to this post by Eli Zaretskii
On Thu, Jun 13, 2019 at 08:41:02AM +0300, Eli Zaretskii wrote:
> > From: Alp Aker <[hidden email]>
> > Date: Thu, 13 Jun 2019 00:16:36 -0400
> > Cc: Eli Zaretskii <[hidden email]>, Emacs devel <[hidden email]>
> >
> > > matrix:
> > >    0.000000  1.000000   0.000000
> > >   -1.000000  0.000000 232.000000
> > >    0.000000  0.000000   1.000000
> > >
> > > I don’t know exactly what that means. The 1 and -1 are shearing the
> > > image in the x and y dimensions. The 232 is moving the image in the y
> > > dimension
> >
> > This is a transformation matrix using so-called homogeneous coordinates:
> >
> > https://en.wikipedia.org/wiki/Transformation_matrix#Affine_transformations
>
> Right, I got that far, it's the details that somewhat confuse me, see
> below.
>
> > It's a clockwise 90 degree rotation followed by a translation along the y
> > axis.
>
> This already goes contrary to my geometric intuition, please bear with
> me. The rotation is around the (0,0) origin, i.e. around the top-left
> corner of the original image, right? If so, the rotation should have
> been followed by a translation along the X axis, not Y, because
> rotating a 333-pixel wide, 233-pixel high image 90 deg clockwise
> produces a 233-pixel wide, 333-pixel high image that is entirely to
> the LEFT of the Y axis. Here's ASCII-art representation of that:
>
>   +------------------+> X        +----------+-------------------> X
>   |                  |           |          |
>   |                  |           |          |
>   |                  |           |          |
>   |                  |   ===>    |          |
>   +------------------+           |          |
>   |                              |          |
>   |                              |          |
>   |                              |          |
>   |                              +----------+
>   |                              |
>   V                              V
>   Y                              Y
>
> The above is just after the rotation around (0,0). Is that correct,
> or am I missing something?

I think I confused things by saying ‘followed by’, as I think they
probably happen simultaneously.

It works if you consider it as moving the origin from the top left
corner, 232 pixels down the Y axis to the bottom left corner, then
rotating. I don’t really know how to think about this that deeply,
especially since this matrix is the result of two translations and a
rotation multiplied together.
One thing that may also be confusing is that there are two different
approaches to this. XRender applies the transforms to the image,
whereas NS applies the transforms to the surface the image is to be
drawn to. I have a suspicion, having read some Windows API
documentation (but not much), that Windows works the same way as NS.
This is unfortunate as it’s harder to understand what’s going on.

A way to visualise the difference is that XRender is like taking a
photo and rotating and moving it around before putting it on a bit of
paper. The NS method is like holding a photo still and moving the bit
of paper under it.

The key difference is that for NS I have to invert the transformation
matrix. I also have to take more care with where I place the origin
before applying the transformation matrix.

I could be wrong, but it may explain why things aren’t doing what
you’re expecting. Or it could simply be down to the fact I transposed
the rows and columns. I think it’s probably a good idea for us to deal
with the transposition first before making any definite statements on
this.
--
Alan Third
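[Editorial note: the matrix Alan printed above can be applied directly to the four corners of the 333x233 image. This Python sketch reads the dump row-major with the translation in the rightmost column — one of the two candidate layouts debated in this thread, so an assumption, not an established fact.]

```python
# The matrix from the dump above, read row-major with the translation
# in the last column (one of the two candidate layouts).
M = [[ 0.0, 1.0,   0.0],
     [-1.0, 0.0, 232.0],
     [ 0.0, 0.0,   1.0]]

def apply(m, x, y):
    # Multiply the matrix by the column vector (x, y, 1).
    return (m[0][0] * x + m[0][1] * y + m[0][2],
            m[1][0] * x + m[1][1] * y + m[1][2])

for corner in [(0, 0), (333, 0), (0, 233), (333, 233)]:
    print(corner, "->", apply(M, *corner))
# (0, 0)     -> (0.0, 232.0)
# (333, 0)   -> (0.0, -101.0)
# (0, 233)   -> (233.0, 232.0)
# (333, 233) -> (233.0, -101.0)
```

Under this reading the corners land in a box running from y = -101 up to y = 232 rather than inside a 233x333 destination, which would be consistent with XRender treating the matrix as mapping destination pixels back to source pixels — but that interpretation is a guess at this point in the thread.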
In reply to this post by Alp Aker-4
> From: Alp Aker <[hidden email]>
> Date: Thu, 13 Jun 2019 11:57:34 -0400
> Cc: Alan Third <[hidden email]>, Emacs devel <[hidden email]>
>
> > Maybe that's the problem: if the rotation is counter-clockwise, then
> > the translation should indeed be along the Y axis.
>
> The translation doesn't depend on the rotation in any way.

I think the final value of translation does depend on the rotation,
because the center of the image's position after the rotation depends
on the rotation angle.

> The general form of an affine transformation is F(x) + b, where F is
> a linear transformation (rotation, shear, scaling) and addition by b
> is translation.

Yes, but our transformation includes translation, rotation, and
another translation. Not just one translation. IOW, it isn't the
transformation matrix that is given; it's the operation on the image.
The transformation matrix surely depends on whether the rotation is
clockwise or counter-clockwise.

Anyway, until we agree on the equations that convert (x,y) into
(x',y'), it is IMO pointless to argue about details. So can anyone
tell where the meaning of the matrix passed to XRender is described?
In reply to this post by Eli Zaretskii
On Thu, Jun 13, 2019 at 08:48:48AM +0300, Eli Zaretskii wrote:
> > Date: Wed, 12 Jun 2019 23:07:46 +0100
> > From: Alan Third <[hidden email]>
> > Cc: [hidden email]
> >
> > > > +Cropping is performed after scaling but before rotation.
> > >
> > > This sounds strange to me; are you sure? I'd expect cropping to be
> > > done either before everything else or after everything else. Is
> > > this so because that's how XRender does it? At the very least, it
> > > begs the question whether the parameters of :crop are measured in
> > > units before or after scaling.
> >
> > I agree, but this is how our imagemagick code does it and I didn’t
> > want to make my code behave differently, even though I think it makes
> > no sense.
>
> OK, but what about the question regarding the units of :crop
> parameters -- should they be interpreted as before or after the
> scaling?

After the scaling.

> > > Can you please add the equations used to perform this affine
> > > transformation, i.e. how x' and y' are computed from x and y? I
> > > think it will go a long way towards clarifying the processing.
> >
> > I’ll add some further explanations of how to use the affine
> > transformation matrices, but I don’t know that I’ll be able to do a
> > very good job of explaining exactly how they work. I would suggest
> > that if someone is interested they look it up elsewhere, however I
> > also don’t think it’s necessary to fully understand the maths to be
> > able to use them.
>
> I have shown my interpretation of the equations. Trouble is, I cannot
> find what XRender does anywhere. Does someone know where to look for
> that?

I suspect we’d have to dive into the code to see exactly what XRender
is doing, however from my own testing I believe the transform matrix
passed into XRender works exactly as described here:

https://en.wikipedia.org/wiki/Transformation_matrix#Affine_transformations

I can’t find a great explanation of how exactly transformation
matrices work, but there are a lot of explanations available.
> > Luckily we don’t have to worry about the last step as the graphics
> > toolkit will do it for us.
>
> Unfortunately, I do have to worry about all of the steps, because I
> need to figure out how to map all this to the equivalent Windows APIs.
> Thus my questions, for which I apologize. I'd prefer that someone
> more knowledgeable about graphics programming did the changes on
> Windows, but no one stepped forward yet.

You shouldn’t really have to fully understand the maths to implement
this, so clearly we’re going wrong somewhere. Can you point me to the
Windows API documentation and perhaps I can work out how exactly we
need to approach this?
--
Alan Third
In reply to this post by Alan Third
> Date: Thu, 13 Jun 2019 17:12:15 +0100
> From: Alan Third <[hidden email]>
> Cc: Alp Aker <[hidden email]>, [hidden email]
>
> It works if you consider it as moving the origin from the top left
> corner, 232 pixels down the Y axis to the bottom left corner, then
> rotating. I don’t really know how to think about this that deeply,
> especially since this matrix is the result of two translations and a
> rotation multiplied together.

OK, but how to be sure this is the correct interpretation? The code
which implements the rotations does a translation, followed by
rotation, followed by another translation, and multiplies all the 3
matrices to produce the result. If we are not sure about what these
transformations mean, how do we know the result is correct? How did
_you_ know to write that code? Is there some XRender-related
documentation that describes the meaning of each matrix element? For
example, do the translations describe how the pixels are moved or how
the origin of the coordinate system is moved?

> One thing that may also be confusing is that there are two different
> approaches to this. XRender applies the transforms to the image,
> whereas NS applies the transforms to the surface the image is to be
> drawn to. I have a suspicion, having read some Windows API
> documentation (but not much) that Windows works the same way as NS.
> This is unfortunate as it’s harder to understand what’s going on.

If everybody and their dog work differently, why are we doing this
according to XRender, and not the other way around?

> The key difference is that for NS I have to invert the transformation
> matrix.

What do you mean by "invert", and where is that NS code? And if you
figured out how to map what XRender does to what NS does, you probably
understand well what the XRender matrix means, right?

> I could be wrong, but it may explain why things aren’t doing what
> you’re expecting. Or it could simply be down to the fact I transposed
> the rows and columns.
> I think it’s probably a good idea for us to deal with the
> transposition first before making any definite statements on this.

Not sure what you mean by "deal with transposition".

Thanks.
In reply to this post by Alan Third
> Date: Thu, 13 Jun 2019 17:58:04 +0100
> From: Alan Third <[hidden email]>
> Cc: [hidden email]
>
> Can you point me to the Windows API documentation and perhaps I can
> work out how exactly we need to approach this?

I want to use PlgBlt, see

https://docs.microsoft.com/en-us/windows/desktop/api/wingdi/nf-wingdi-plgblt

For that, I need to compute, for each of the original image's
vertices, the coordinates of the corresponding vertex of the
transformed image. That's why I was asking how to interpret the
matrix elements for transforming pixel coordinates.

I also want to know when no rotation is involved, so that I could use
the existing code (which only supports scaling and will be modified to
support cropping) on older Windows versions that cannot support
rotations (PlgBlt is not available on those platforms).

Thanks.
In reply to this post by Alan Third
> Date: Thu, 13 Jun 2019 17:58:04 +0100
> From: Alan Third <[hidden email]>
> Cc: [hidden email]
>
> Can you point me to the Windows API documentation and perhaps I can
> work out how exactly we need to approach this?

I of course appreciate the offer, and responded with the information.
But really, this kind of effort on your part is not necessary. If you
can add a test for these capabilities, it shouldn't take too long to
figure out what the matrix means by comparing the effect to the
expected results. So maybe working on the tests will be the most
effective use of your time in this matter.
In reply to this post by Eli Zaretskii
On Thu, 13 Jun 2019 at 18:57, Eli Zaretskii <[hidden email]> wrote:
> > From: Alp Aker <[hidden email]>
> > Date: Thu, 13 Jun 2019 11:57:34 -0400
> > Cc: Alan Third <[hidden email]>, Emacs devel <[hidden email]>
> >
> > > Maybe that's the problem: if the rotation is counter-clockwise, then
> > > the translation should indeed be along the Y axis.
> >
> > The translation doesn't depend on the rotation in any way.
>
> I think the final value of translation does depend on the rotation,
> because the center of the image's position after the rotation depends
> on the rotation angle.
>
> > The general form of an affine transformation is F(x) + b, where F is
> > a linear transformation (rotation, shear, scaling) and addition by b
> > is translation.
>
> Yes, but our transformation includes translation, rotation, and
> another translation. Not just one translation. IOW, it isn't the
> transformation matrix that is given; it's the operation on the image.

XRender uses homogeneous matrices. In this system a 3x3 projective
matrix can represent any affine transformation as one matrix. The
matrix representing a translation, then a rotation, then a translation
can then be calculated by multiplying three matrices. (It is in fact
again a rotation followed by a single translation.)

The 9-coordinate homogeneous matrix can represent any projective
transformation. Simplifying to the case of affine transformations,
the matrix is of the form

  [Fxx Fxy Tx]
  [Fyx Fyy Ty]
  [  0   0  1]

Fij represent the scale/rotation/shear and Ti represent the
translation. Multiplying two matrices of this form gives another
matrix of the same form.

To transform coordinates (X Y) you multiply the matrix (on the left)
by a column vector (on the right) as follows.

  [Fxx Fxy Tx] [X]   [Fxx * X + Fxy * Y + Tx]
  [Fyx Fyy Ty] [Y] = [Fyx * X + Fyy * Y + Ty]
  [  0   0  1] [1]   [                     1]

A pure translation goes like this:

  [1 0 Tx] [X]   [X + Tx]
  [0 1 Ty] [Y] = [Y + Ty]
  [0 0  1] [1]   [     1]

> The transformation matrix surely depends on whether the rotation is
> clockwise or counter-clockwise.
The origin is at the top left, so a pure rotation clockwise about the
origin through angle a goes like this:

  [cos(a) -sin(a) 0] [X]   [cos(a) * X - sin(a) * Y]
  [sin(a)  cos(a) 0] [Y] = [sin(a) * X + cos(a) * Y]
  [     0       0 1] [1]   [                      1]

To combine several transformations (e.g., to get a matrix that does a
translation, then a rotation, then a translation) you multiply the
matrices of the transformations together. The matrix for the last
transformation goes furthest left.

> Anyway, until we agree on the equations that convert (x,y) into
> (x',y'), it is IMO pointless to argue about details. So can anyone
> tell where the meaning of the matrix passed to XRender is described?

The documentation is here:

<https://www.x.org/releases/X11R7.7/doc/libXrender/libXrender.txt>

What is not specified is the memory layout of the matrix. Some more
details are here:
<http://www.talisman.org/~erlkonig/misc/x11-composite-tutorial/>.
From the tutorial, I guess the memory layout is

  XTransform xform = {{
    { XDoubleToFixed(Fxx), XDoubleToFixed(Fyx), XDoubleToFixed(0) },
    { XDoubleToFixed(Fxy), XDoubleToFixed(Fyy), XDoubleToFixed(0) },
    { XDoubleToFixed(Tx),  XDoubleToFixed(Ty),  XDoubleToFixed(1) }
  }};

It could equally be

  XTransform xform = {{
    { XDoubleToFixed(Fxx), XDoubleToFixed(Fxy), XDoubleToFixed(Tx) },
    { XDoubleToFixed(Fyx), XDoubleToFixed(Fyy), XDoubleToFixed(Ty) },
    { XDoubleToFixed(0),   XDoubleToFixed(0),   XDoubleToFixed(1) }
  }};

Someone can work out which through experimentation. Here I have fixed
three of the matrix entries to 0, 0, 1; this is the simplification to
the case of affine transformations only that I mentioned. The full
3x3 matrix represents an arbitrary projective transformation. We
don't need that. For example, the tutorial mentions representing a
scale factor W as

  [1 0   0]
  [0 1   0]
  [0 0 1/W]

which implies that there is a perspective-division step, the X and Y
coordinates being divided by the Z (depth) coordinate. I don't know
at what stage of the pipeline this happens.
We can sidestep the issue by setting W = Z = 1 and representing a
scale factor of W using the matrix

  [W 0 0]
  [0 W 0]
  [0 0 1]
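[Editorial note: the parenthetical above — that translate·rotate·translate collapses to a single rotation plus one translation — can be illustrated with the 333x233 example from this thread. This is a Python sketch of the formalism only, not the actual image.c computation.]

```python
import math

def matmul(a, b):
    """Multiply two 3x3 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def translate(tx, ty):
    return [[1, 0, tx], [0, 1, ty], [0, 0, 1]]

def rotate(deg):
    c, s = math.cos(math.radians(deg)), math.sin(math.radians(deg))
    # Clockwise on screen, where the y axis points down.
    return [[c, -s, 0], [s, c, 0], [0, 0, 1]]

w, h = 333, 233
# Move the image centre to the origin, rotate 90 degrees, then move
# the centre of the now h-wide, w-high image back into the first
# quadrant.  The matrix for the last operation goes furthest left.
M = matmul(translate(h / 2, w / 2),
           matmul(rotate(90), translate(-w / 2, -h / 2)))
print([[round(v) for v in row] for row in M])
# -> [[0, -1, 233], [1, 0, 0], [0, 0, 1]]
```

The upper-left 2x2 block of the product is still just the rotation, and the whole translation history collapses into the last column. Under this forward, row-major reading the top-left corner (0, 0) maps to (233, 0), matching the geometric expectation for a clockwise rotation stated earlier in the thread; a toolkit that maps destination to source would want the inverse of this matrix instead.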
In reply to this post by Eli Zaretskii
On Thu, Jun 13, 2019 at 08:11:21PM +0300, Eli Zaretskii wrote:
> > Date: Thu, 13 Jun 2019 17:58:04 +0100
> > From: Alan Third <[hidden email]>
> > Cc: [hidden email]
> >
> > Can you point me to the Windows API documentation and perhaps I can
> > work out how exactly we need to approach this?
>
> I want to use PlgBlt, see
>
> https://docs.microsoft.com/en-us/windows/desktop/api/wingdi/nf-wingdi-plgblt
>
> For that, I need to compute, for each of the original image's
> vertices, the coordinates of the corresponding vertex of the
> transformed image. That's why I was asking how to interpret the
> matrix elements for transforming pixel coordinates.

I think this should be what you need:

  x’ = tm[0][0] * x + tm[0][1] * y + tm[0][2] * 1
  y’ = tm[1][0] * x + tm[1][1] * y + tm[1][2] * 1

where tm is the completed transformation matrix.

BTW, are you aware that you can use the XFORM struct:

https://docs.microsoft.com/en-us/windows/desktop/api/wingdi/ns-wingdi-xform

That maps exactly to the matrices, which is one of the reasons I went
down this route originally. Almost everything supports them.

> I also want to know when no rotation is involved, so that I could use
> the existing code (which only supports scaling and will be modified to
> support cropping) on older Windows versions that cannot support
> rotations (PlgBlt is not available on those platforms).

I’d be inclined to just skip image_set_rotation when on a platform
that doesn’t support it. Or store the basic crop and scaling
information separately so you don’t have to worry about the matrices
at all.

Now I see why you want to be able to distinguish between the
availability of the different types of transform too.
--
Alan Third
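[Editorial note: the PlgBlt question reduces to applying the equations above to three corners of the source rectangle. A hypothetical Python sketch follows; PlgBlt's point array takes the destination points for the source's upper-left, upper-right and lower-left corners, and the matrix used here is the illustrative forward clockwise-90 matrix discussed earlier in the thread, not necessarily what image.c hands to XRender.]

```python
def transform_point(tm, x, y):
    """The equations above: apply a row-major 3x3 matrix to (x, y)."""
    return (tm[0][0] * x + tm[0][1] * y + tm[0][2],
            tm[1][0] * x + tm[1][1] * y + tm[1][2])

w, h = 333, 233
# Forward matrix for a 90-degree clockwise rotation of a w x h image
# that keeps the result in the first quadrant (illustrative values).
tm = [[0, -1, h],
      [1,  0, 0],
      [0,  0, 1]]

# PlgBlt wants the destination corners corresponding to the source's
# upper-left, upper-right and lower-left corners, in that order.
corners = [transform_point(tm, *p) for p in [(0, 0), (w, 0), (0, h)]]
print(corners)  # -> [(233, 0), (233, 333), (0, 0)]
```

Those three points pin down the destination parallelogram; the fourth corner is implied. This is the per-vertex computation Eli describes, done once per image rather than per pixel.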
In reply to this post by Richard Copley-2
> From: Richard Copley <[hidden email]>
> Date: Thu, 13 Jun 2019 20:00:49 +0100
> Cc: Alp Aker <[hidden email]>, [hidden email],
>   Emacs Development <[hidden email]>
>
> XRender uses homogeneous matrices. In this system a 3x3 projective
> matrix can represent any affine transformation as one matrix. The
> matrix representing a translation, then a rotation, then a translation
> can then be calculated by multiplying three matrices. (It is in fact
> again a rotation followed by a single translation.)

Thanks for a useful and detailed description.